GNAS2009 Abstract #2 - Maglott, Donna

Naming genes at NCBI - Donna Maglott

Entrez Gene assigns a unique identifier (GeneID) to each gene identified by map location or sequence. The GeneID is used to track other key attributes of the gene, including identifiers from model organism databases and nomenclature groups, symbols, and full names. Names are assigned to these records in the following order of priority: (1) the name from a nomenclature authority, (2) the name assigned to the ortholog in the genome designated as the naming model, (3) annotation on sequence submissions, and (4) the concatenation of 'LOC' and the GeneID. The GeneID-to-sequence relationship is then used by multiple groups in NCBI for purposes that include (1) assigning names to RefSeq RNAs, (2) assigning names to genes represented in dbSNP, HomoloGene, GEO, and UniGene, and (3) identifying genes annotated in NCBI's genome annotation pipeline. Names are maintained by integrating computational and curation-based data flows, with the latter supported by a web-accessible database that is available to collaborating groups. This talk will review these data flows.
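
The naming priority described above can be illustrated with a minimal sketch in Python. This is not NCBI's implementation: the `GeneRecord` class, its field names, and the `choose_name` function are all hypothetical, introduced only to show how a first-available-source rule with a 'LOC'+GeneID fallback might look.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneRecord:
    # All field names are hypothetical, chosen for illustration only.
    gene_id: int
    authority_name: Optional[str] = None   # name from a nomenclature authority
    ortholog_name: Optional[str] = None    # name of the ortholog in the naming-model genome
    submission_name: Optional[str] = None  # name from annotation on sequence submissions


def choose_name(record: GeneRecord) -> str:
    """Return the first available name, in the priority order given in the abstract."""
    candidates = (
        record.authority_name,
        record.ortholog_name,
        record.submission_name,
    )
    for name in candidates:
        if name:  # skip sources that are missing or empty
            return name
    # Lowest priority: concatenate 'LOC' with the GeneID.
    return f"LOC{record.gene_id}"
```

With this sketch, a record that has no name from any of the first three sources falls through to the 'LOC' form, while a record with both an ortholog-derived name and a submission-derived name takes the ortholog-derived one.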