Editing of NCI Thesaurus 12.04e was completed on April 30, 2012.  Version 12.04e 
was April's fifth build in our development cycle.

This directory contains the following files:

	ReadMe.txt			This file
	Thesaurus_12.04e.FLAT.zip	The NCI Thesaurus 12.04e in flat file format
	Thesaurus_12.04e.OWL.zip	The NCI Thesaurus 12.04e in OWL
	Thesaurus_12.04e.OWL-byCode.zip	The NCI Thesaurus 12.04e in OWL byCode format
	ThesaurusInf_12.04e.OWL.zip	The NCI Thesaurus 12.04e in Inferred OWL
	Thesaurus_12.04e.LexGrid.zip	The NCI Thesaurus 12.04e in LexGrid 5.1 XML
	

The zip files unpack to the following files:

	Thesaurus_12.04e.FLAT.zip	Thesaurus.txt
	Thesaurus_12.04e.OWL.zip	Thesaurus.owl
	Thesaurus_12.04e.OWL-byCode.zip	Thesaurus.owl
	ThesaurusInf_12.04e.OWL.zip	ThesaurusInferred.owl
	Thesaurus_12.04e.LexGrid.zip	Thesaurus.xml

In the first three formats described below (flat file, asserted OWL byName, and asserted 
OWL byCode), the relations are as stated by the editors; no inferred relations are specified.

The Thesaurus.txt flat file packaged in Thesaurus_12.04e.FLAT.zip is in tab-delimited 
format.  Included in this format are all the terms associated with NCI Thesaurus concepts 
(names and synonyms), a text definition of the concept (if one is present), and stated 
parent-child relations, sufficient to reconstruct the hierarchy.  The fields are:

	code <tab> concept name <tab> parents <tab> synonyms <tab> definition

The "parents" field contains the concept name(s) of the superconcept(s).
If a "parents" or "synonyms" field contains multiple entries, these are pipe-delimited.  
For root concepts without "parents", this field contains the string "root_node".  The 
first entry in the "synonyms" field is the preferred name of the concept.  If no preferred 
name has been stated for the concept, this field contains the concept name.  The "definition" 
field contains only one definition, even if more than one definition is associated with the 
concept; not all concepts have definitions.  
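
The field layout above can be read with a few lines of Python.  The following is a 
minimal sketch, not an official EVS tool.  The function names, the UTF-8 encoding, and 
the padding of short rows are assumptions made for illustration.  Because the "parents" 
field holds concept names, the child index is keyed by concept name rather than by code.

```python
from collections import defaultdict

ROOT_MARKER = "root_node"  # value of "parents" for root concepts, per this ReadMe


def parse_line(line):
    """Split one flat-file row into its five documented fields."""
    cols = line.rstrip("\r\n").split("\t")
    cols = (cols + [""] * 5)[:5]  # assumption: pad rows that lack trailing fields
    code, name, parents, synonyms, definition = cols
    if parents == ROOT_MARKER:
        parent_names = []
    else:
        parent_names = [p for p in parents.split("|") if p]
    return {
        "code": code,
        "name": name,
        "parents": parent_names,
        "synonyms": [s for s in synonyms.split("|") if s],
        "definition": definition or None,  # not all concepts have a definition
    }


def load_flat_file(path, encoding="utf-8"):
    """Read every non-blank row; the encoding is an assumption."""
    with open(path, encoding=encoding) as handle:
        return [parse_line(line) for line in handle if line.strip()]


def build_children_index(records):
    """Map each parent concept name to the names of its direct children."""
    children = defaultdict(list)
    for rec in records:
        for parent in rec["parents"]:
            children[parent].append(rec["name"])
    return dict(children)
```

Calling build_children_index(load_flat_file("Thesaurus.txt")) gives a name-to-children 
map from which the stated hierarchy can be walked top-down, starting at the concepts 
whose "parents" list is empty.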

The Thesaurus.owl file packaged in Thesaurus_12.04e.OWL.zip (asserted OWL byName) contains the 
entire terminology expressed in the Web Ontology Language (OWL, http://www.w3.org/TR/owl-ref/), 
and the rdf:about (rdf:ID) utilizes semantically meaningful identifiers, e.g. "Gene".  
Relations are as stated by the editors; no inferred relations are specified.  
Annotations of use only to the EVS (for example, editor's notes) are absent in the released terminology.

The unzipped Thesaurus.owl is available directly at http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl


The Thesaurus.owl file packaged in Thesaurus_12.04e.OWL-byCode.zip 
(asserted OWL byCode) contains the entire terminology expressed in the 
Web Ontology Language (OWL), and the rdf:about (rdf:ID) utilizes semantically 
meaningless identifiers, e.g. "C123456", which is the value of the "code" 
property found in the asserted "byName" file.  Relations are as stated by the 
editors; no inferred relations are specified.  Annotations of use only to the 
EVS (for example, editor's notes) are absent in the released terminology.  
The rdf:about values of annotation and object properties are also replaced 
with their corresponding codes.  The "code" property is removed from this file.
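
To see which identifier style a given Thesaurus.owl uses, the class identifiers can be 
listed with the Python standard library.  This is a rough sketch under stated 
assumptions: that concepts are serialized as owl:Class elements in RDF/XML, that the 
identifier is carried in rdf:about or rdf:ID, and that any URI prefix ends in "#".  
The function name and the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def class_identifiers(source):
    """Return the set of rdf:about / rdf:ID values found on owl:Class elements.

    `source` is a file name or a binary file object.  A set is used because
    the same class may be referenced more than once in the file.
    """
    class_tag = "{%s}Class" % OWL_NS
    about_attr = "{%s}about" % RDF_NS
    id_attr = "{%s}ID" % RDF_NS
    found = set()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == class_tag:
            ident = elem.get(about_attr) or elem.get(id_attr)
            if ident:  # anonymous classes carry no identifier
                # Assumption: keep only the fragment after the last "#".
                found.add(ident.rsplit("#", 1)[-1])
            elem.clear()  # release memory; the full file is large
    return found


# Hypothetical in-memory document showing both identifier styles.
SAMPLE = b"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="#C123456"/>
  <owl:Class rdf:ID="Gene"/>
  <owl:Class/>
</rdf:RDF>"""

print(sorted(class_identifiers(io.BytesIO(SAMPLE))))
```

In the byName file the returned values would be names such as "Gene"; in the byCode 
file they would be codes such as "C123456".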

The ThesaurusInferred.owl file packaged in ThesaurusInf_12.04e.OWL.zip 
contains the terminology from the NCI Thesaurus but excludes retired concepts and 
includes inferred relationships. This file is created for import into the UMLS 
and NCI Metathesaurus. Properties of use only to the EVS (e.g. editor notes) are 
absent in the released terminology.  It also uses semantically meaningless 
rdf:about identifiers as described above for "asserted OWL byCode".

The Thesaurus.xml file packaged in Thesaurus_12.04e.LexGrid.zip
contains the entire terminology in the LexGrid XML format.  For more information 
about LexGrid XML, please see the Vocabulary Knowledge Center
(https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/LexGrid_Model_and_Schema).

The file nci_code_cui_map_201112.dat contains the map of NCI Thesaurus (NCIt) codes
to NCI Metathesaurus (NCIMeta) CUIs (concept unique identifiers).  The format of the 
file is code <pipe> CUI <pipe>.  The CUIs are either derived from the UMLS Metathesaurus 
(CUIs that begin with "C" followed by a digit) or specific to the NCI Metathesaurus (CUIs 
that begin with "CL").  This mapping file is created for each release of the NCIMeta, 
which is published quarterly, so it is not current with every release of NCIt, which is 
published monthly.  Newly created Thesaurus concepts will not appear in this file 
until the next release of the Metathesaurus, and newly retired Thesaurus concepts will 
continue to appear until the next release of the Metathesaurus.  Each archived mapping 
file may be found in the archives directory beneath the last Thesaurus release from which 
it was generated.  The current mapping file is derived from NCIt version 11.12e and
NCIMeta version 201112.
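
The mapping file can be loaded in the same spirit.  The sketch below is illustrative 
only.  The function names are hypothetical, and keeping a list of CUIs per code is a 
precaution rather than a documented property of the file.  The classification follows 
the prefix rule described above.

```python
def classify_cui(cui):
    """Label a CUI by the prefix rule described in this ReadMe."""
    if cui.startswith("CL"):
        return "NCIMeta"  # specific to the NCI Metathesaurus
    if len(cui) > 1 and cui[0] == "C" and cui[1].isdigit():
        return "UMLS"     # derived from the UMLS Metathesaurus
    return "unknown"


def load_cui_map(lines):
    """Build {NCIt code: [CUI, ...]} from 'code|CUI|' rows.

    `lines` is any iterable of strings, for example an open file.
    Blank or incomplete rows are skipped.
    """
    mapping = {}
    for line in lines:
        parts = line.strip().split("|")
        if len(parts) >= 2 and parts[0] and parts[1]:
            mapping.setdefault(parts[0], []).append(parts[1])
    return mapping
```

For example, with open("nci_code_cui_map_201112.dat") as f: cui_map = load_cui_map(f).  
A code missing from the map may simply be newer than the NCIMeta release from which 
the file was generated.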

For previous releases of the NCI Thesaurus please see the archives at
ftp://ftp1.nci.nih.gov/pub/cacore/EVS/NCI_Thesaurus/archive/

For additional information, please see the Release Notes of LexEVS 5.1
(https://wiki.nci.nih.gov/display/EVS/LexEVS+5.1+Release+Notes).