Gene families help
Help and information related to the new gene families reports.
We strongly encourage naming families and groups of genes related by sequence and/or function using a "root" symbol. This is an efficient and informative way to name related genes, and it already works well for a number of established gene families. Our gene family data are now fully searchable via our search tool; please see search help for further information.
Gene family index
Our gene family index is ordered alphabetically by family name, with root symbols shown in a separate column. Results are paginated and users can change the number of results shown per page. Typing in either search box narrows the results to those matching the input, as shown in the example below.
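The narrowing and pagination behaviour described above can be pictured with a short Python sketch. It is only an illustration: the names `Family`, `filter_families` and `get_page` are hypothetical, the sample rows are illustrative, and we assume that each search box performs a case-insensitive "contains" match, which may differ from what the index actually does.

```python
from typing import List, NamedTuple


class Family(NamedTuple):
    name: str
    root_symbol: str  # empty string when the family has no root symbol


def filter_families(families: List[Family], name_query: str = "",
                    root_query: str = "") -> List[Family]:
    """Keep families matching both search boxes, ordered by family name.

    Assumption: matching is a case-insensitive substring test.
    """
    name_q = name_query.strip().lower()
    root_q = root_query.strip().lower()
    hits = [f for f in families
            if name_q in f.name.lower() and root_q in f.root_symbol.lower()]
    return sorted(hits, key=lambda f: f.name.lower())


def get_page(items: list, page_number: int, per_page: int) -> list:
    """Return one page of results; page numbers start at 1."""
    start = (page_number - 1) * per_page
    return items[start:start + per_page]


# Illustrative rows only, not a copy of the real index.
SAMPLE = [
    Family("DEAD box polypeptides", "DDX"),
    Family("Cholinergic receptors, muscarinic", "CHRM"),
    Family("Amine receptors", ""),
]

receptor_hits = filter_families(SAMPLE, name_query="recept")
first_page = get_page(filter_families(SAMPLE), page_number=1, per_page=2)
```

Leaving both queries empty returns the whole index in alphabetical order, which mirrors the default view.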
Fig. 1 The new gene family index
Gene family pages
Our gene family pages have been redesigned to provide more information, conform to a more standardised format, and make it easier to browse between families. An example gene family page for Cholinergic receptors, muscarinic is shown below.
Fig. 2 The new Gene Family page for "Cholinergic receptors, muscarinic"
Gene family names, IDs and aliases
Each gene family has a unique numerical ID that forms the last part of the gene family page URL to aid linking and downloading. For example, the numerical ID for "Cholinergic receptors, muscarinic" is 180. Each family also has a unique gene family name and, where a family has a root symbol, this is shown in parentheses next to the name. Please note that not all gene family pages equate to a particular set of genes with the same root symbol; in these cases only a gene family name is displayed. Other commonly used gene family names and abbreviations are listed following the text “Also known as:”; for example, "Cholinergic receptors, muscarinic" are also known as "Muscarinic acetylcholine receptors", "Muscarinic receptors" and "mAChRs".
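Because the ID is the last part of the page URL, it can be recovered from a link with a few lines of Python. The sketch below is a minimal illustration: the function name `family_id_from_url` is hypothetical and the URL is a placeholder on example.org, not the real address of a gene family page.

```python
import re
from urllib.parse import urlparse


def family_id_from_url(url: str) -> int:
    """Return the numerical gene family ID found in the last path segment."""
    path = urlparse(url).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    if not re.fullmatch(r"[0-9]+", last_segment):
        raise ValueError(f"URL does not end in a numerical ID: {url!r}")
    return int(last_segment)


# Placeholder URL for illustration only; the real page path may differ.
example_url = "https://example.org/genefamilies/180"
family_id = family_id_from_url(example_url)
```

A URL whose last segment is not a number raises `ValueError` instead of silently returning a wrong ID.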
Gene family hierarchy map
The gene family pages provide a display of curated hierarchical relationships between families and allow users to browse easily through each hierarchy. As shown in Fig. 2, all gene families that fall into hierarchies include a “Gene family hierarchy map”. Hovering on any gene family within the map opens a pop-up that contains a link to that particular gene family page, while clicking and holding the mouse button highlights the current gene family and its direct relatives, and connects them with a red outline. Users can reposition any gene family box within the map by clicking and dragging it, which makes it possible to reorganise the diagram. In addition to the map, text links to any related families are provided within the page. For example, in Fig. 2 there are direct links to the Amine receptors and Cholinergic receptors.
Gene family descriptions
Many of our gene family pages contain a description of the family. These are often from Wikipedia (as shown in Fig. 2) or UniProt (e.g. Integrins), in which case the source is clearly marked with a link through to the original page. In some cases the descriptions have been written by HGNC curators and these can be identified by [Source: HGNC] (e.g. IGH orphons); if a description comes from another source, that source is clearly displayed within square brackets.
Example gene mapped domains graphic
Where gene family members share a particular protein domain we often show a graphical display of the protein domain structure for an example gene family member, which is sourced from Pfam via UniProt ID. In Fig. 2 the domain structure is shown for the product of the CHRM1 gene. Hovering over a domain within the graphic will reveal a label containing the domain name, description and Pfam family ID, while clicking on a domain takes the user through to the Pfam description page for that domain.
Genes within the family
HGNC Symbol Reports for each gene within a family can be accessed by clicking on the Approved Symbol. By default the table of family members is sorted by Approved Symbol, but where the family shares a root symbol the members can be sorted by that symbol even where it is a synonym or previous symbol, e.g. in the DEAD box polypeptides (DDX) family INTS6 is sorted by its previous symbol of DDX26. Note that the symbol used for sorting is highlighted in green to make this clear.
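The root-symbol sort can be sketched in Python as follows. This is an illustration rather than the code behind the table: the names `Gene`, `sort_symbol` and `natural_key` are hypothetical, and we assume a "natural" ordering in which DDX5 sorts before DDX10, which the page itself may or may not use.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Gene:
    approved: str
    previous: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)


def sort_symbol(gene: Gene, root: str) -> str:
    """Pick the symbol used for sorting: the first symbol that starts with
    the root (approved, then previous, then synonyms), else the approved one."""
    for symbol in [gene.approved, *gene.previous, *gene.synonyms]:
        if symbol.upper().startswith(root.upper()):
            return symbol
    return gene.approved


def natural_key(symbol: str) -> list:
    """Split a symbol into text and number parts so that DDX5 < DDX10.

    Assumption: natural ordering is used; this is not stated in the help text.
    """
    parts = re.split(r"(\d+)", symbol)
    return [int(p) if p.isdigit() else p.upper() for p in parts]


family = [
    Gene("DDX27"),
    Gene("INTS6", previous=["DDX26"]),
    Gene("DDX10"),
    Gene("DDX1"),
    Gene("DDX5"),
]

by_approved = sorted(family, key=lambda g: natural_key(g.approved))
by_root = sorted(family, key=lambda g: natural_key(sort_symbol(g, "DDX")))
```

In `by_root`, INTS6 is placed between DDX10 and DDX27 because it is sorted by its previous symbol DDX26, whereas in `by_approved` it falls at the end of the list.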
Above the table of family members there is a small pie chart symbol next to the text “Genes contained within the family”. Clicking on the pie chart opens a pop-up box containing statistics on the locus types of the genes included in the family. For example, the Tubulin gene family contains 22 protein-coding genes and four pseudogenes.
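The statistics in the pop-up amount to a count of genes per locus type. A minimal Python sketch, assuming a hypothetical helper `locus_type_stats` and using only the Tubulin figures quoted above (the labels are those used in this help text, not necessarily the exact locus type names stored for each gene):

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def locus_type_stats(locus_types: Iterable[str]) -> Dict[str, Tuple[int, float]]:
    """Map each locus type to (number of genes, percentage of the family)."""
    counts = Counter(locus_types)
    total = sum(counts.values())
    return {locus_type: (n, round(100 * n / total, 1))
            for locus_type, n in counts.items()}


# One entry per gene in the Tubulin family, using the counts given above.
tubulin_locus_types = ["protein-coding"] * 22 + ["pseudogene"] * 4
tubulin_stats = locus_type_stats(tubulin_locus_types)
```

Each value holds the gene count followed by its share of the family, which is the information a pie chart needs.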
Gene family downloads
We now provide a way of downloading gene families as data sets, allowing users to choose between downloading a single family or the entire family hierarchy. For example, users can choose between downloading only the Cholinergic receptors, muscarinic genes shown in Fig. 2 or they can browse through to the G protein-coupled receptors and download all genes belonging to that hierarchy. Each gene family page has a download link at the bottom of the page that generates a text file with all gene symbols and extra data fields such as “Approved Name” and “HGNC ID”. Please note that some gene family pages do not contain a list of genes; these pages exist to complete the hierarchical structure and enable users to download all the genes from further down the hierarchy, e.g. Amine receptors and Cholinergic receptors.
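The difference between a single-family download and a hierarchy download can be sketched in Python. This is an illustration under stated assumptions: the function `genes_in_hierarchy` and the two dictionaries are hypothetical, and the parent and child links shown are a simplified guess at the shape of this part of the hierarchy rather than a copy of the curated data.

```python
from typing import Dict, List


def genes_in_hierarchy(family: str,
                       children: Dict[str, List[str]],
                       members: Dict[str, List[str]]) -> List[str]:
    """Collect the genes of a family and of every family below it."""
    seen_families = set()
    genes = set()
    stack = [family]
    while stack:
        current = stack.pop()
        if current in seen_families:
            continue  # a family reachable by two routes is only counted once
        seen_families.add(current)
        genes.update(members.get(current, ()))
        stack.extend(children.get(current, ()))
    return sorted(genes)


# Illustrative structure only: pages without their own gene list have no
# entry in `members` and contribute genes solely through their subsets.
children = {
    "Amine receptors": ["Cholinergic receptors"],
    "Cholinergic receptors": ["Cholinergic receptors, muscarinic"],
}
members = {
    "Cholinergic receptors, muscarinic":
        ["CHRM1", "CHRM2", "CHRM3", "CHRM4", "CHRM5"],
}

single_family = sorted(members.get("Cholinergic receptors, muscarinic", []))
whole_hierarchy = genes_in_hierarchy("Amine receptors", children, members)
```

A page such as Amine receptors has no gene list of its own, so its single-family list is empty while its hierarchy download still returns the genes from the families beneath it.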
Summary of core data fields
Gene family name and root symbol - The family name is displayed at the top of the gene family report page, followed where applicable by the common root symbol of the family's genes in parentheses.
Also known as - Synonymous names for the gene family.
A subset of - Contains links to the families to which the current set belongs.
Family contains the following subsets - Contains links to subsets within the current family.
Specialist advisors - Names of specialists who advise and recommend appropriate gene symbols to the committee for a particular gene family.
External resources - Links to resources that will provide extra information about the current family.
Associated publications - References pertinent to the gene family. The user can choose to view these references at either PubMed or European PubMed Central. This section does not aim to list all possible published papers on the set but provides links to papers that first described the gene family in question or papers that are particularly relevant to the nomenclature of the genes.
Downloads - Download gene family data in a CSV text format.