The Genomic Data Commons (GDC) is an NCI resource that provides the cancer research community with a unified data repository, enabling data sharing across cancer genomic studies in support of precision medicine. The GDC Terminology is a set of terms used in the GDC data dictionary and data model. It is currently divided into two subsets, one capturing data properties and the other capturing property values.
These two files are available for download from the NCI EVS FTP site:
GDC Property Terminology (Microsoft Excel 2003)
GDC Value Terminology (Microsoft Excel 2003)
The GDC Property Terminology spreadsheet has the following columns:

Spreadsheet Column | Content Description |
---|---|
Subset Code | The NCI Thesaurus (NCIt) concept code attached to the GDC concept. NCIt codes are unique strings that begin with a C and are followed by a series of digits. |
Subset Name | The name of the GDC Terminology subset. |
Concept Code | The NCI Thesaurus (NCIt) concept code attached to the concept. |
NCI Preferred Term | The preferred term chosen by the NCIt staff that unambiguously describes the concept. |
NCI Definition | A text definition of the term created by subject matter experts at the NCIt. |
GDC Preferred Term | The label used in the GDC data model for the concept. |
GDC Synonyms | Terms that are synonymous with the GDC Preferred Term. |
The GDC Value Terminology spreadsheet has the following columns:

Spreadsheet Column | Content Description |
---|---|
Subset Code | The NCI Thesaurus (NCIt) concept code attached to the GDC concept. NCIt codes are unique strings that begin with a C and are followed by a series of digits. |
Subset Name | The name of the GDC Terminology subset. |
Concept Code | The NCI Thesaurus (NCIt) concept code attached to the concept. |
NCI Preferred Term | The preferred term chosen by the NCIt staff that unambiguously describes the concept. |
NCI Definition | A text definition of the term created by subject matter experts at the NCIt. |
GDC Submission Value \| GDC Property Preferred Term | The term used as the GDC submission value that has been mapped to the NCIt concept. This term is followed by a pipe (\|) and then by the GDC property preferred term associated with that submission value. If multiple GDC submission values are mapped to the same NCIt concept, each value \| property statement is separated by double pipes (\|\|). |
Is Value for GDC Property Code | A pipe (\|)-delimited list of codes for the GDC Property concepts this GDC Value concept is used for. |
Is Value for GDC Property Preferred Term | A pipe (\|)-delimited list of GDC Preferred Terms for the GDC Property concepts this GDC Value concept is used for. |
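
The pipe conventions described in the table above can be unpacked with a few lines of Python. The following is only a minimal sketch: the function names are hypothetical, and it assumes that cell text has already been read from the spreadsheet as a string, that whitespace around the delimiters is not significant, and that submission values do not themselves contain a pipe character.

```python
from __future__ import annotations


def parse_value_property_cell(cell: str) -> list[tuple[str, str]]:
    """Split a 'value | property || value | property' cell into pairs.

    Assumption: '||' separates statements and the first '|' inside a
    statement separates the submission value from the property term.
    """
    pairs = []
    for statement in cell.split("||"):
        statement = statement.strip()
        if not statement:
            continue
        value, _, prop = statement.partition("|")
        pairs.append((value.strip(), prop.strip()))
    return pairs


def parse_pipe_list(cell: str) -> list[str]:
    """Split a pipe-delimited list cell, dropping empty entries."""
    return [item.strip() for item in cell.split("|") if item.strip()]
```

For example, calling `parse_value_property_cell("value one | property_a || value two | property_b")` on this made-up placeholder string yields two (value, property) pairs, and `parse_pipe_list` can be applied to the "Is Value for GDC Property Code" and "Is Value for GDC Property Preferred Term" columns.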
Also included on the NCI EVS FTP site are the following additional files:
About (This file.)
Changes (Text files of changes between the previous and the current version of the GDC terminology subsets
(GDC_Property_Terminology_Changes.txt and GDC_Value_Terminology_Changes.txt). For each change record, the Changes.txt
file contains a complete row of tab-delimited data with the same data elements as described above.
An "A" precedes each new concept addition, a "C" precedes each modification to an existing concept, and a "D"
precedes each concept that has been deleted.)
Version (A text file that contains the version of NCI Thesaurus that
corresponds to the current spreadsheet data. The database is reconciled on the last Monday of every month. The files are
posted during the following two weeks. The version appears as YR.MOweek. An example is 21.06d, which corresponds to the
year 2021, the month of June, and the "d" refers to the fourth Monday of the month.)
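
The YR.MOweek version format described above can be decoded programmatically. This is a rough sketch with a hypothetical function name. It assumes the two-digit year belongs to the 2000s and that the letters "a" through "e" stand for the first through fifth Monday of the month; the text only confirms that "d" is the fourth.

```python
import re

# Two-digit year, a dot, two-digit month, then one week letter.
VERSION_RE = re.compile(r"^(\d{2})\.(\d{2})([a-e])$")


def parse_version(version: str) -> dict:
    """Decode a version string such as '21.06d' into its parts."""
    match = VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    yy, mm, week = match.groups()
    month = int(mm)
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in version: {version!r}")
    return {
        "year": 2000 + int(yy),  # assumption: years are in the 2000s
        "month": month,
        "monday_of_month": ord(week) - ord("a") + 1,  # assumption: a=1 ... e=5
    }
```

With the example from the text, `parse_version("21.06d")` gives year 2021, month 6, and the fourth Monday.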
Archived files are available at: GDC Archive (directory of dated release versions).
N.B.: If there are no changes to the data for a particular month, the files will not be reposted.
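
When a Changes file is posted, its records can be grouped by change type. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the "A", "C", or "D" marker is the first tab-delimited field of each record, which the description does not state explicitly. Lines that do not start with one of those markers are skipped.

```python
from __future__ import annotations

# Marker letters as described for the Changes files.
CHANGE_TYPES = {"A": "added", "C": "modified", "D": "deleted"}


def read_changes(text: str) -> list[tuple[str, list[str]]]:
    """Return (change type, remaining fields) for each change record.

    Assumption: the marker letter is the first tab-delimited field.
    """
    records = []
    for line in text.splitlines():
        fields = line.split("\t")
        marker = fields[0].strip()
        if marker not in CHANGE_TYPES:
            continue  # blank line, header, or unrecognized layout
        records.append((CHANGE_TYPES[marker], fields[1:]))
    return records
```

The text of a file such as GDC_Value_Terminology_Changes.txt can be passed in after reading it with `open(path, encoding="utf-8").read()`; if the real files place the marker differently, the field indexing would need to be adjusted.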
Help requests regarding these files should be sent to NCIThesaurus@mail.nih.gov.
* If an attempt to view the Excel spreadsheet results in a page of nonsense characters, check the security settings in Excel to permit viewing. This is achievable by clicking the highlighted bar above the data, but below the menu bar in the spreadsheet.