The great gene name clean-up

As part of our current funding cycle we are working towards naming genes for vertebrate species that currently do not have a dedicated gene nomenclature committee, to be consistent with their human orthologs. We have started some ‘behind the scenes’ work on chimpanzee genes and will bring you more news on this project soon. In preparation for transferring our names to other species we have been looking at ways to standardise and simplify gene name formats, including removing punctuation wherever this does not create ambiguity. We are also actively removing the names of other species for genes that have been named after homologs, and removing molecular weights if these do not form part of the gene symbol. For cases where we are proposing simplifying the format of gene family names we are contacting our specialist advisors to discuss this with them; some example gene families that have undergone this process already include the ABC family, the NADH:ubiquinone oxidoreductase subunits of mitochondrial complex I and the tumor necrosis factor superfamily.

Public release of useful HGNC code

We are now releasing code that we think others could find useful. The first piece of code, pfam-dom-draw, takes an HGNC approved symbol and a UniProt accession and creates a diagram representing the UniProt protein with Pfam domains mapped onto the protein. This code is used to create the domain graphics on our gene family pages. The second piece of code, europe-pmcentralizer, takes PubMed IDs from a data-pubmed-ids attribute and creates references for each valid PubMed ID from Europe PubMed Central. By default a short reference is created, but clicking on the expand icon reveals the abstract and the full author list. This code is used for references in our symbol reports and gene family pages.
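To give a feel for the kind of lookup described above, here is a minimal Python sketch. It is not the europe-pmcentralizer code itself: the function names are hypothetical, and the assumption that a data-pubmed-ids value is a comma- or space-separated list is ours. The sketch only picks out plausible PubMed IDs and builds a query URL for the public Europe PMC REST search service, without making any network request.

```python
import re
from urllib.parse import urlencode

# Public Europe PMC REST search endpoint.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def parse_pubmed_ids(attr_value: str) -> list[str]:
    """Return the plausible PubMed IDs found in an attribute value.

    Assumption: IDs are separated by commas and/or whitespace.
    Non-numeric tokens and repeated IDs are dropped; order is preserved.
    """
    ids: list[str] = []
    for token in re.split(r"[,\s]+", attr_value.strip()):
        if re.fullmatch(r"[1-9][0-9]*", token) and token not in ids:
            ids.append(token)
    return ids


def build_search_url(pmid: str) -> str:
    """Build a Europe PMC search URL for a single PubMed ID.

    'src:med' restricts the search to PubMed/MEDLINE records, and
    resultType=core asks for the fuller record rather than the short one.
    """
    params = {
        "query": f"ext_id:{pmid} src:med",
        "format": "json",
        "resultType": "core",
    }
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Example attribute value using two PubMed IDs from this newsletter.
    for pmid in parse_pubmed_ids("26479518, 26395334"):
        print(build_search_url(pmid))
```

Fetching each URL and turning the JSON response into a short or expanded reference is left out here; how the released code handles that step is best checked in the code itself.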

Advance notice of removal of old gene family pages

Since the Spring we have provided a full release of our new, improved gene family resource. In order to smooth the transition to the new family data we have also been providing full access to our old gene family data, via the ‘Old gene families archive’ link under the Gene Families menu. However, we have not been updating these families: the data is a freeze from April 2014 and is becoming increasingly out of date. We hope that all of our users are now comfortable with the new gene family resource and hence we plan to remove the old gene family pages in mid-January. It will still be possible to download an archived file featuring the old gene family data via our FTP site, but this will not be updated and so we encourage everybody to use our new gene family data.

Links to FlyBase gene groups

We have recently worked directly with the curators at FlyBase to add links from our gene family pages to their equivalent Drosophila melanogaster gene group reports. An example HGNC gene family with such a link is the Caspases; the link to the FlyBase gene group: CASPASE can be seen in the external resources field. You can read all about FlyBase gene groups in their publication FlyBase: establishing a Gene Group resource for Drosophila melanogaster.

New Gene Family pages

We are continually adding more manually curated gene families to our site (so please use this dataset!). In recent months we have considerably increased the number of phosphatase families in our resource, and have included relationships between these families to reflect their position within the complex phosphatase hierarchy. You can browse through these families using the gene family hierarchy map on the Phosphatases page. We have also made a gene family page for the SMC5-6 protein complex which features a more standardised name format for the NSMCE# encoded subunits and the recently renamed NDNL2 gene, which is now NSMCE3 following agreement from the community. Other new gene families of note include the cullins, the membrane bound O-acyltransferases, the mediator complex, the spectrins and the spindlin family. If you have any suggestions for new gene families that we can add, please let us know.

Gene Symbols in the News

There has been some good news in the fight against pancreatic cancer: three proteins encoded by the LYVE1, REG1A and TFF1 genes have been found to be present at much higher levels in the urine of pancreatic cancer patients. Excitingly, this has led to the development of an accurate test to detect the disease. There is also hope in the search to identify women who are more at risk of heart attacks and strokes following the news that women carrying a specific variant of the BCAR1 gene suffer higher rates of these two diseases. Finally, work in Finland has found an association between individuals who carry a particular variant of the HTR2B gene and impulsive behaviour, such as random violence. These individuals always show higher levels of these behaviours but the effect is greatly magnified when they are inebriated.

Meeting News

Ruth attended the Non-Coding Genome 2015 meeting in Heidelberg, Germany from 18th-21st October. She was interested to visit EMBL, having worked at the EMBL-EBI outstation since 2008, and received positive feedback on the discussion points raised in her poster, ‘The challenges of naming long non-coding RNA genes’.

Susan will be attending the 2016 International Plant & Animal Genome (PAG) XXIV conference on January 9-13, 2016 in San Diego, USA, where she will be giving a talk as part of the ‘Genome annotation resources at the EBI Workshop’ and presenting a poster about our work on naming genes across vertebrate species.


Kalman LV, Agúndez JA, Appell ML, Black JL, Bell GC, Boukouvala S, Bruckner C, Bruford E, Caudle K, Coulthard S, Daly AK, Del Tredici AL, den Dunnen JT, Drozda K, Everts R, Flockhart D, Freimuth R, Gaedigk A, Hachad H, Hartshorne T, Ingelman-Sundberg M, Klein TE, Lauschke VM, Maglott DR, McLeod HL, McMillin GA, Meyer UA, Müller DJ, Nickerson DA, Oetting WS, Pacanowski M, Pratt VM, Relling MV, Roberts A, Rubinstein WS, Sangkuhl K, Schwab M, Scott SA, Sim SC, Thirumaran RK, Toji LH, Tyndale R, van Schaik RH, Whirl-Carrillo M, Yeo KJ, Zanger UM. Pharmacogenetic Allele Nomenclature: International Workgroup Recommendations for Test Result Reporting. Clin Pharmacol Ther. 2015 Oct 19. doi: 10.1002/cpt.280. [Epub ahead of print] PMID:26479518

Krupska I, Bruford E, Chaqour B. Eyeing the Cyr61/CTGF/NOV (CCN) group of genes in development and diseases: highlights of their structural likenesses and functional dissimilarities. Hum Genomics 2015 Sep 23; 9(1):24. PMID:26395334 PMCID:PMC4579636

Bruford E, Lane L, Harrow J. Devising a consensus framework for validation of novel human coding loci. J Proteome Res. 2015 Sep 14 [Epub ahead of print] PMID:26367542