precisionFDA (pFDA) Terminology Files

The NCIt-pFDA terminology files provided here support the cooperative efforts of the National Cancer Institute's Thesaurus (NCIt) and the precisionFDA staff to develop terminology that assists in creating a secure, collaborative, high-performance computing platform that builds a community of experts around the analysis of biological datasets in order to advance precision medicine. These efforts are described more fully on the precisionFDA web page.

pFDA terminology files are available for download from this NCI EVS ftp site (https://evs.nci.nih.gov/ftp1/FDA/pFDA/) in two formats:

NCIt-pFDA_Terminology_Subsets.xls (Microsoft Excel)
NCIt-pFDA_Terminology_Subsets.txt (Tab-delimited text)

Each file has column headers on the first row:

Spreadsheet Column      Content Description
NCIt Subset Code        The NCIt concept code attached to the subset concept. NCIt Codes are unique strings that begin with a C and are followed by a series of digits.
pFDA Subset Name        The name of the subset.
NCIt Concept Code       The NCIt concept code attached to the concept. NCIt Codes are unique strings that begin with a C and are followed by a series of digits.
NCIt Preferred Term     The NCIt preferred term attached to the concept.
pFDA Preferred Term     The pFDA preferred term attached to the concept.
NCIt Definition         A text definition of the term created by an NCI EVS subject matter expert.
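
As an illustration of the column layout described above, the following Python sketch reads tab-delimited rows and groups concepts by subset. It is a minimal example under stated assumptions, not an official reader: the header text is assumed to match the column names listed above exactly, the function name read_subsets is hypothetical, and the sample rows are made-up placeholders rather than real NCIt content.

```python
import csv
import io
import re
from collections import defaultdict

# Column names as listed in this file; the exact header text in the
# real download is an assumption and may differ.
COLUMNS = [
    "NCIt Subset Code",
    "pFDA Subset Name",
    "NCIt Concept Code",
    "NCIt Preferred Term",
    "pFDA Preferred Term",
    "NCIt Definition",
]

# NCIt codes: a "C" followed by a series of digits.
NCIT_CODE = re.compile(r"^C\d+$")


def read_subsets(handle):
    """Group concept rows by (subset code, subset name)."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    subsets = defaultdict(list)
    for row in reader:
        code = row["NCIt Concept Code"]
        if not NCIT_CODE.match(code):
            raise ValueError(f"unexpected concept code: {code!r}")
        subsets[(row["NCIt Subset Code"], row["pFDA Subset Name"])].append(row)
    return dict(subsets)


# Made-up placeholder rows, NOT real NCIt content.
sample_rows = [
    ["C000001", "Example Subset", "C000002", "Example Term A", "Term A", "Placeholder definition."],
    ["C000001", "Example Subset", "C000003", "Example Term B", "Term B", "Placeholder definition."],
]
sample = "\n".join("\t".join(r) for r in [COLUMNS] + sample_rows) + "\n"

subsets = read_subsets(io.StringIO(sample))
for (subset_code, subset_name), rows in subsets.items():
    print(subset_code, subset_name, len(rows))
```

To read the downloaded text file instead of the in-memory sample, an open file handle for NCIt-pFDA_Terminology_Subsets.txt can be passed to read_subsets; the file's character encoding is not stated here and would need to be checked.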

pFDA terminology is bundled into subsets, identified by code and name in the first and second columns. These are the names and definitions of the subsets.

Subset Name                                 Subset Description
pFDA Bioinformatics/Genomics Terminology    Terms/codes for describing the bioinformatics and genomics ideas.

Also included on the NCI EVS ftp site (https://evs.nci.nih.gov/ftp1/FDA/pFDA/) are the following additional files:
About (This file)
Changes.txt (A text file of changes between the previous and the current version of pFDA terminology. Each change record in Changes.txt is a complete row of tab-delimited data with the same data elements as described above. An "A" precedes each new concept addition, a "C" precedes each modification to an existing concept, and a "D" precedes each concept that has been deleted.)
Version.txt (A text file that contains the version of NCI Thesaurus that corresponds to the current spreadsheet data. The database is reconciled on the last Monday of every month, and the files are posted during the following two weeks. The version appears as YR.MOweek. An example is 20.02d, which corresponds to the year 2020 and the month of February; the "d" refers to the fourth Monday of the month.)
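
The change flags and the version string described above can be decoded with a few lines of Python. This is a hedged sketch, not a specification: it assumes the A/C/D flag is the first tab-delimited field of each change record, that two-digit years fall in the 2000s, and that week letters run from "a" to "e". The function names parse_change_line and parse_version are hypothetical.

```python
import re

# Flags described for Changes.txt.
CHANGE_FLAGS = {"A": "added", "C": "changed", "D": "deleted"}

# YR.MOweek, e.g. 20.02d; letters a-e are an assumption (a month has at
# most five Mondays).
VERSION = re.compile(r"^(\d{2})\.(\d{2})([a-e])$")


def parse_change_line(line):
    """Split one change record into (change type, remaining fields).

    Assumes the flag is the first tab-delimited field of the record.
    """
    flag, *fields = line.rstrip("\r\n").split("\t")
    if flag not in CHANGE_FLAGS:
        raise ValueError(f"unknown change flag: {flag!r}")
    return CHANGE_FLAGS[flag], fields


def parse_version(text):
    """Return (year, month, week number) for a YR.MOweek string."""
    match = VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    yy, mm, week = match.groups()
    # Assumes years in the 2000s; "a" is week 1, "d" is week 4.
    return 2000 + int(yy), int(mm), ord(week) - ord("a") + 1


print(parse_version("20.02d"))  # (2020, 2, 4)
```

Returning the week as a number keeps versions within the same month easy to compare and sort.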




N.B.: If there are no changes to the data for a particular month, the files will not be reposted.

Contact Information and Updates:

Archived files are available at:

https://evs.nci.nih.gov/ftp1/FDA/pFDA
Archive/ Directory of dated release versions.

Help requests on these files should go to NCIThesaurus@mail.nih.gov