February 7, 2008

Editing of NCI Thesaurus 07.12e was completed on December 3, 2007.  Version 
07.12e was December's first build in our development cycle.

This directory contains the following files:

	ReadMe.txt			This file
	Thesaurus_07.12e.XML.zip	The NCI Thesaurus version 07.12e in Apelon's XML format
	Thesaurus_07.12e.FLAT.zip	The NCI Thesaurus 07.12e in flat file format
	Thesaurus_07.12e.OWL.zip	The NCI Thesaurus 07.12e in OWL

The zip files unpack to the following files:

	Thesaurus_07.12e.XML.zip	Thesaurus_07.12e.xml
	Thesaurus_07.12e.FLAT.zip	Thesaurus_07.12e.txt
	Thesaurus_07.12e.OWL.zip	Thesaurus.owl

In all three formats described below, the ontology is in a defined state, 
i.e., relations are as stated by the editors; no inferred relations are
specified.

The Thesaurus_07.12e.xml file contains the entire terminology and associated 
ontologic constructions from the NCI Thesaurus, including properties, roles, 
and kinds.  The DTD for the XML is as defined by Apelon, Inc., whose editing 
tools are being used in the construction of the Thesaurus.  Properties of 
use only to the EVS (e.g. editor notes) are absent in the released terminology. 


The Thesaurus_07.12e.txt flat file is in tab-delimited format.  Included in this 
format are all the terms associated with NCI Thesaurus concepts (names and 
synonyms), a text definition of the concept (if one is present), and stated 
parent-child relations, sufficient to reconstruct the hierarchy.  The fields 
are:

	code <tab> concept name <tab> parents <tab> synonyms <tab> definition

The "parents" field contains the concept name(s) of the superconcept(s).
If a "parents" or "synonyms" field contains multiple entries, these 
are pipe-delimited.  For root concepts without "parents", this field
contains the string "root_node".  The first entry in the "synonyms" field 
is the preferred name of the concept.  If no preferred name has been stated
for the concept, this field contains the concept name.  The 
"definition" field contains only one definition, even if more than one 
definition is associated with the concept; not all concepts have 
definitions.  
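
As an illustration of the field layout above, the following is a minimal 
Python sketch that splits flat-file records into the five fields and rebuilds 
the parent-child hierarchy from the "parents" field.  It is not part of the 
release.  The function names, the UTF-8 default encoding, and the sample 
records (codes, names, and definition) are assumptions made up for this 
example; only the field order, the pipe delimiter, and the "root_node" 
marker come from the description above.

```python
from collections import defaultdict

ROOT_MARKER = "root_node"  # value of the "parents" field for root concepts

# Made-up records for illustration only; these are not real Thesaurus entries.
SAMPLE = [
    "X0001\tExample_Root\troot_node\tExample Root\t",
    "X0002\tExample_Child\tExample_Root\tExample Child|Sample Child\tA made-up definition.",
    "X0003\tExample_Grandchild\tExample_Child|Example_Root\tExample Grandchild\t",
]


def parse_line(line):
    """Split one record into: code, concept name, parents, synonyms, definition."""
    fields = line.rstrip("\r\n").split("\t")
    # Pad short records so a missing trailing field does not raise an error.
    fields += [""] * (5 - len(fields))
    code, name, parents, synonyms, definition = fields[:5]
    return {
        "code": code,
        "name": name,
        # Root concepts carry the marker string instead of parent names.
        "parents": [] if parents == ROOT_MARKER
        else [p for p in parents.split("|") if p],
        # The first synonym is the preferred name of the concept.
        "synonyms": [s for s in synonyms.split("|") if s],
        "definition": definition or None,
    }


def build_children(records):
    """Map each parent concept name to the names of its direct children."""
    children = defaultdict(list)
    for rec in records:
        for parent in rec["parents"]:
            children[parent].append(rec["name"])
    return dict(children)


def load(path, encoding="utf-8"):
    """Read a whole flat file; the encoding is an assumption, not documented."""
    with open(path, encoding=encoding) as handle:
        return [parse_line(line) for line in handle if line.strip()]


if __name__ == "__main__":
    records = [parse_line(line) for line in SAMPLE]
    for parent, kids in build_children(records).items():
        print(parent, "->", ", ".join(kids))
```

To apply the sketch to the released file, call load("Thesaurus_07.12e.txt") 
in place of the sample list.  The hierarchy is keyed on concept names rather 
than codes because the "parents" field holds concept names.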

The Thesaurus.owl file contains the entire terminology expressed in the OWL 
web ontology language (http://www.w3.org/TR/owl-ref/), with the exception of
the Ontylog namespace declaration, which was deemed unnecessary.  The Ontylog
Roles were converted to restrictions on OWL properties, and most of the 
concept annotations in Ontylog properties were converted to OWL 
AnnotationProperty; as in the Ontylog XML file, properties of use only to 
the EVS (e.g. editor notes) are absent in the OWL file.  Because 
Roles in Ontylog are mapped from a domain kind to a range kind, the OWL 
version of the Thesaurus has each kind as a root class to facilitate the 
conversion of Roles to OWL properties.  The kind root classes are declared 
disjoint in the OWL file.  

The unzipped Thesaurus.owl is available directly at 
http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl

For additional information, please see the Release Notes of caCORE 3.2.