| NCIt Subset Code | NCIt Subset Name | NCIt Concept Code | NCIt PT | pFDA PT | NCIt Definition |
| --- | --- | --- | --- | --- | --- |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C153367 | BED Format | Browser Extensible Data | A tab-delimited text file format that allows the specification of the sequence data that is displayed in an annotation track. The minimum required information is chromosome, start position, and end position. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C153249 | Binary Alignment Map | Binary Alignment Map | A binary representation of a sequence alignment map compressed by the BGZF library. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C17964 | Bioinformatics | Bioinformatics | Bioinformatics derives knowledge from computer analysis of biological data. These can consist of the information stored in the genetic code, but also experimental results from various sources, patient statistics, and scientific literature. Research in bioinformatics includes method development for storage, retrieval, and analysis of the data. Bioinformatics is a rapidly developing branch of biology and is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, physics, and linguistics. It has many practical applications in different areas of biology and medicine. (M. Nilges and Jens P. Linge, Unite de Bio-informatique Structurale, Institut Pasteur, Paris) |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C116155 | Biopolymer Sequencing | Sequencing | A process to identify and determine the primary structure of, and the order of constituents in, a biopolymer. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C188483 | Cheminformatics | Cheminformatics | A branch of informatics focused on chemical data. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C17961 | DNA Methylation | DNA methylation | The process by which methyl groups are added to nucleotides in genomic DNA. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C47845 | FASTA Format | FASTA | A sequence in FASTA format consists of a single-line description, followed by lines of sequence data. The first character of the description line is a greater-than (">") symbol in the first column. Sequences are represented in the standard IUB/IUPAC single letter amino acid and nucleic acid codes, with a single hyphen or dash being used to represent a gap of indeterminate length; in amino acid sequences an asterisk ("*") can represent a translation stop. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C153250 | FASTQ Format | FASTQ | A text-based format for storing a biological sequence that encodes the nucleotide calls as well as their quality scores. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C84343 | Genomics | Genomics | The study of the structure, function, expression, evolution, mapping and editing of genomes. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C45447 | Genotyping | Genotyping | The determination of the DNA sequence of an individual. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C99752 | Indel Mutation | INDEL | A mutation class that includes insertion mutations, deletion mutations and mutation events where both an insertion and a deletion have occurred. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C54683 | International Chemical Identifier | InChI | A textual identifier for chemical substances designed to provide a standard and human-readable way to encode molecular information that also facilitates searches in printed and electronic data sources. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C133910 | MDL Molfile Format | MOLFILE | A chemical text file format developed by Molecular Design Limited (MDL) that represents information about molecular atoms, bonds, connectivity and coordinates. The file extension is .mol. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C133996 | MDL Structure-data File Format | Structure Data File | A family of chemical text file formats developed by Molecular Design Limited (MDL) that represent multiple chemical structural records and associated data fields. The file extension is .sd or .sdf. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C153191 | Metagenomics | Metagenomics | The direct study of genetic material recovered from environmental samples, largely dominated by microbial organisms. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C101293 | Next Generation Sequencing | Next-Generation Sequencing | Technologies that facilitate the rapid determination of the nucleotide sequence of large numbers of strands or segments of DNA or RNA. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C153349 | Nucleotide Sequence Read | Sequencing read | The manual or automated determination of the nucleotide order in a nucleic acid fragment obtained after the completion of a sequencing process. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C20085 | Proteomics | Proteomics | The global analysis of cellular proteins. Proteomics uses a combination of sophisticated techniques including two-dimensional (2D) gel electrophoresis, image analysis, mass spectrometry, amino acid sequencing, and bio-informatics to resolve comprehensively, to quantify, and to characterize proteins. The application of proteomics provides major opportunities to elucidate disease mechanisms and to identify new diagnostic markers and therapeutic targets. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C188688 | Sequence Alignment | Alignment | The process of arranging protein, DNA or RNA sequences to identify regions with similar sequences that may elucidate functional, structural, or evolutionary relationships between the sequences. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C153248 | Sequence Alignment Map | Sequence Alignment Map | A tab-delimited, text-based format for storing biological sequences aligned to a reference sequence. A SAM file includes an optional header section and an alignment section. Each alignment line has 11 mandatory fields for essential alignment information, such as mapping position, and a variable number of optional fields. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C18279 | Single Nucleotide Polymorphism | Single-Nucleotide Polymorphisms | A variation of a single nucleotide at a specific location of the genome due to base substitution, present at an appreciable frequency between individuals of a single interbreeding population. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C129888 | Single Nucleotide Polymorphism Profile | SNP genotyping | The analysis of all of the single nucleotide polymorphisms in the genome of a biological sample. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C164674 | Single Nucleotide Variant | Single-Nucleotide Variant | A variation of a single nucleotide at a specific location of the genome due to base substitution, which is found at any frequency in the population. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C188689 | Single Nucleotide Variant Genotyping | SNV genotyping | The measurement of genetic variations of single nucleotide variants (SNVs) between members of a species. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C153189 | Transcriptomics | Transcriptomics | A study of the complete set of RNA transcripts that are produced by the genome, under specific circumstances or in a specific cell. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C172216 | Variant Call File Format | Variant Call Format | A text-based electronic file used for storing gene sequence variation data. The first text section is composed of a header containing the metadata and keywords used in the file. This is followed by the body of the file which is tab-separated into eight mandatory data columns for each sample. Additionally, the body of the file can include an unlimited number of optional columns to record other sample-related data. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C188690 | Variant Calling | Variant calling | Technology that detects differences between an individual's DNA sequence and a reference DNA sequence. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C101295 | Whole Exome Sequencing | Whole-exome sequencing | A procedure that can determine the DNA sequence for all of the exons in an individual. |
| C188687 | pFDA Bioinformatics/Genomics Terminology | C101294 | Whole Genome Sequencing | Whole-Genome Sequencing | A procedure that can determine the DNA sequence for nearly the entire genome of an individual. |
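
The FASTA Format and BED Format definitions above describe simple line-oriented layouts, and a small example may make them concrete. The following is a minimal Python sketch, not a reference parser: the function names `parse_fasta` and `parse_bed_line`, the `BedInterval` type, and the sample records are all hypothetical and introduced only for illustration. It assumes well-formed input and reads only the three BED fields the definition lists as the minimum.

```python
from typing import Dict, Iterable, NamedTuple


class BedInterval(NamedTuple):
    """Hypothetical container for the three minimum BED fields."""
    chrom: str
    start: int
    end: int


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each description line (without the leading '>') to its sequence.

    Sequence lines that follow a description line are concatenated.
    Gap ('-') and stop ('*') characters are kept as-is.
    """
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def parse_bed_line(line: str) -> BedInterval:
    """Read chromosome, start, and end from one tab-delimited BED line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chromosome, start, end")
    # Any further columns are optional and ignored in this sketch.
    return BedInterval(fields[0], int(fields[1]), int(fields[2]))


# Made-up sample data, for illustration only.
fasta_text = """>seq1 hypothetical example
ACGT-ACGT
TTGA
>seq2 hypothetical peptide
MKV*
"""
records = parse_fasta(fasta_text.splitlines())
interval = parse_bed_line("chr1\t100\t200\tfeatureA")
print(records)
print(interval)
```

In practice an established library would normally be preferable to hand-written parsing; the sketch only mirrors the structure stated in the definitions.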