Chemical Biology and Drug Development (CBDD) Terminology Files

The Chemical Biology and Drug Development (CBDD) Terminology is a set of terms used in the design, development, and description of drug substances, pertaining mainly, but not exclusively, to the chemistry (including chemical reactions) of small molecules and polymers. The terminology is intended to support the chemoinformatics community's effort to develop conceptual models for describing modern drug design, such as that performed by the NCI Computer-Aided Drug Design (CADD) Group, part of the NCI Chemical Biology Laboratory, and collaborators at NIH, academia, and other government agencies.

There are two CBDD terminology subsets, one focused on chemical processes (CBDD Process Terminology) and one focused on chemical structures (CBDD Structure Terminology). Each subset is available for download from this NCI EVS ftp site in two formats:
CBDD_Process_Terminology.xls (Microsoft Excel 2003)*
CBDD_Process_Terminology.txt (Tab-delimited text)
CBDD_Structure_Terminology.xls (Microsoft Excel 2003)*
CBDD_Structure_Terminology.txt (Tab-delimited text)

Each file has column headers on the first row:
Spreadsheet Column     Content Description
Subset Code            The NCI Thesaurus (NCIt) concept code attached to the CBDD concept. NCIt Codes are unique strings that begin with a C and are followed by a series of digits.
Subset Name            The name of the terminology set.
Concept Code           The NCI Thesaurus (NCIt) concept code attached to the concept.
NCIt Preferred Term    The preferred term chosen by the NCIt staff that unambiguously describes the concept.
CBDD Preferred Term    The preferred term chosen by CBDD for the concept.
CBDD Synonym(s)        Terms chosen by CBDD that are synonymous with the Preferred Term.
NCI Definition         A text definition of the term created by subject matter experts at the NCIt.
Semantic Type          A categorization of the type of concept in the context of the UMLS semantic network.
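
As an illustration of how the tab-delimited files might be read, the following Python sketch loads rows into dictionaries keyed by the column headers. It assumes the header spellings match the table above exactly, which should be checked against the first row of the real file. The read_terminology function and the sample row are hypothetical; the sample codes and text are placeholders, not real NCIt content.

```python
import csv
import io

# Column headers as listed in the table above. Their exact spelling in the
# distributed file is an assumption; compare with the file's first row.
EXPECTED_COLUMNS = [
    "Subset Code",
    "Subset Name",
    "Concept Code",
    "NCIt Preferred Term",
    "CBDD Preferred Term",
    "CBDD Synonym(s)",
    "NCI Definition",
    "Semantic Type",
]


def read_terminology(handle):
    """Return one dict per data row, keyed by the header row."""
    # QUOTE_NONE keeps any quote characters in definitions as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    found = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in found]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return list(reader)


# Invented sample data for illustration only (not real NCIt content).
SAMPLE = (
    "\t".join(EXPECTED_COLUMNS) + "\n"
    + "\t".join([
        "C000001", "Example Subset", "C000002", "Example NCIt Term",
        "Example CBDD Term", "Example Synonym",
        "Example definition.", "Example Type",
    ]) + "\n"
)

rows = read_terminology(io.StringIO(SAMPLE))
print(rows[0]["Concept Code"])
```

To read a downloaded file instead of the sample, pass an open file handle, for example open("CBDD_Process_Terminology.txt", newline="", encoding="utf-8"). The encoding is an assumption and may need adjusting.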

Also included on the NCI EVS ftp site are the following additional files:

About (This file.)
Changes (A text file of changes between the previous and the current version of the CBDD terminology. Each change record in Changes.txt contains a complete row of tab-delimited data with the same data elements as described above. An "A" precedes each new concept addition, a "C" precedes each modification to an existing concept, and a "D" precedes each concept that has been deleted.)
Version (A text file that contains the version of NCI Thesaurus that corresponds to the current spreadsheet data. The database is reconciled on the last Monday of every month, and the files are posted during the following two weeks. The version appears as YR.MOweek. For example, 19.06d corresponds to the year 2019, the month of June, and the fourth Monday of the month, indicated by the "d".)
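
The YR.MOweek version format can be decoded mechanically. The Python sketch below is one possible reading of it, under two assumptions not stated in this file: that the two-digit year always means 20xx, and that the week letters run "a" (first Monday) through "e" (fifth Monday). The names ThesaurusVersion and parse_version are hypothetical.

```python
import re
from typing import NamedTuple


class ThesaurusVersion(NamedTuple):
    year: int
    month: int
    week: int  # 1 = first Monday of the month, 4 = fourth Monday, and so on


# Two-digit year, a dot, two-digit month, then a single lowercase week letter.
_VERSION_RE = re.compile(r"^(\d{2})\.(\d{2})([a-e])$")


def parse_version(text: str) -> ThesaurusVersion:
    """Split a version string such as '19.06d' into year, month, and week."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    year = 2000 + int(match.group(1))  # assumption: all years are 20xx
    month = int(match.group(2))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in version string: {text!r}")
    week = ord(match.group(3)) - ord("a") + 1  # assumption: a=1 ... e=5
    return ThesaurusVersion(year, month, week)


print(parse_version("19.06d"))
# ThesaurusVersion(year=2019, month=6, week=4)
```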
N.B.: If there are no changes to the data for a particular month, the files will not be reposted.
Archived files are available in the Archive Directory of dated release versions.
Help requests on these files should go to NCIThesaurus@mail.nih.gov
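
For readers who process the Changes file programmatically, the following Python sketch tallies change records by type. It assumes that the "A", "C", or "D" flag is the first tab-delimited field of each record, followed by the data elements described earlier, and that the file has no header row; neither detail is confirmed by this file. The read_changes function and the sample records are hypothetical placeholders, not real NCIt content.

```python
import csv
import io
from collections import Counter

# Flags as described for the Changes file: A = added, C = modified, D = deleted.
CHANGE_TYPES = {"A": "added", "C": "modified", "D": "deleted"}


def read_changes(handle):
    """Yield (change_type, fields) for each non-empty change record."""
    for record in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not any(record):
            continue  # skip blank lines
        # Assumption: the flag is the first field, the data row follows it.
        flag, fields = record[0].strip(), record[1:]
        if flag not in CHANGE_TYPES:
            raise ValueError(f"unexpected change flag: {flag!r}")
        yield CHANGE_TYPES[flag], fields


def sample_record(flag, concept_code):
    """Build one invented record with a flag plus eight placeholder fields."""
    fields = [
        "C000001", "Example Subset", concept_code, "Example NCIt Term",
        "Example CBDD Term", "Example Synonym",
        "Example definition.", "Example Type",
    ]
    return "\t".join([flag] + fields) + "\n"


# Invented sample data for illustration only.
SAMPLE_CHANGES = (
    sample_record("A", "C000010")
    + sample_record("C", "C000011")
    + sample_record("A", "C000012")
    + sample_record("D", "C000013")
)

counts = Counter(kind for kind, _ in read_changes(io.StringIO(SAMPLE_CHANGES)))
print(dict(counts))
```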

* If opening the Excel spreadsheet produces a page of nonsense characters, check the security settings in Excel to permit viewing. This can be done by clicking the highlighted bar that appears above the data and below the menu bar in the spreadsheet.