VCN Gene Definitions
Giving unique and meaningful names to every human gene

3. We need to agree upon a definition of VCN genes in order to distinguish them from segmental duplications present in virtually all humans and from spontaneous or de novo indels/rearrangements, using criteria such as experimental evidence, number of individuals, etc. One option would be to use a polymorphism definition: for example, the occurrence of two or more alleles for a given locus in a population where the commonest allele has a frequency of less than 99%, although the percentage would be open to discussion.
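
As an illustration only, the polymorphism option above can be written as a simple check on allele frequencies. This is a minimal sketch: the function name, the use of copy number as the allele label, and the example frequencies are assumptions made for illustration, and the 0.99 threshold is the figure that the question itself describes as open to discussion.

```python
def is_polymorphic(allele_freqs, max_common_freq=0.99):
    """Check a locus against the illustrative polymorphism definition.

    allele_freqs: dict mapping an allele label (here, a hypothetical copy
        number) to its frequency in one population; values should sum to 1.
    max_common_freq: the commonest allele must be rarer than this value
        (0.99 in the question above, but open to discussion).
    """
    total = sum(allele_freqs.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")

    # Only alleles actually observed in the population count.
    observed = [f for f in allele_freqs.values() if f > 0]
    if len(observed) < 2:
        return False  # a single allele cannot be a polymorphism

    return max(observed) < max_common_freq


# Hypothetical frequencies, not real data.
print(is_polymorphic({1: 0.95, 2: 0.05}))    # commonest allele below 99%
print(is_polymorphic({1: 0.995, 2: 0.005}))  # commonest allele above 99%
```

Because the frequencies are supplied per population, the same locus could pass in one population and fail in another.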

Response 1

I am in agreement with your definition of a copy number variant: a
locus where experimental data have shown that at least two copy number
variant (CNV) alleles exist, each with a frequency of >1% in the general
population, is considered a copy number polymorphism (CNP).  I presume
that you will include in the definition the description that all exons of
the gene need to be contiguously included in the extra copies or deleted
completely in cases where the copy number variants exist as 2, 1 and 0 in
the human population. (?)

Response 2

...These things are all the same, or at least they're part of a continuum,
so I don't understand what you are trying to 'distinguish'?
...I would caution about using the word or concept of 'polymorphism' at
all, as it is totally population specific

Response 3

If you follow suggestions made in point #2 (limit to database attribute) above this
is largely irrelevant for the new copy-number variant events. For 'universal'
segmental duplications, functional studies (including expression and
sequencing studies) will be required to assess the status of the 'genes'.
As these studies are completed there will be nomenclature attached to the
'genes' that might include something like GENEAL1 or GENEAL2 (i.e.
GENEA-like 1 or GENEA-like 2). I think it is dangerous to attach names to
these complex objects if there is no functional work completed that can be
referred to.

Response 4

Made redundant by question 2.

Response 5

Does not really matter if it is a database attribute

Response 6

Not a critical question, if 2A is the answer, but probably needs at
least 1% prevalence.

Response 7

I would use the >1% definition for any human population that can be defined
and where map information of the paralogue can be provided.

Response 8

This is more complicated and will be aided or confounded by the
quality of the data underlying the region and genes of interest.  For
this reason, perhaps a strict definition should be applied first and
see how a system would be used with a couple of example genes (e.g.,
HBA1 and CCL3L1).

Response 9

(Agrees with Response 6)

Response 10

(Agrees with Response 7)

Response 11

My comprehension (or definition) would be: "VCN" refers to a variation
in the number of putatively functional alleles on the same chromosome
and lying in tandem at the same genetic locus. Pseudogenes might exist
within this tandem segment.

Response 12

My guess is that where these exist they are going to occur with multiple
alleles and at relatively high frequency so 95% might well be a
sufficiently high threshold.  Detection is going to be the real issue -
how many samples must be analysed to detect alleles of < 5% frequency?
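
A rough way to approach the sample-size question raised in Response 12 is sketched below. It rests on strong assumptions that are not part of the response: individuals are diploid and unrelated, each chromosome carries the allele independently with probability equal to the allele frequency, and the assay detects every carrier chromosome, which is optimistic for copy number data. The function name and the 95% detection probability are hypothetical choices for illustration.

```python
import math


def individuals_needed(allele_freq, detection_prob=0.95):
    """Smallest number of diploid individuals to sample so that the chance
    of seeing the allele at least once reaches detection_prob.

    Assumes independent chromosomes and a perfect assay (see lead-in).
    P(at least one copy in k chromosomes) = 1 - (1 - allele_freq) ** k
    """
    if not 0 < allele_freq < 1:
        raise ValueError("allele_freq must be between 0 and 1")
    if not 0 < detection_prob < 1:
        raise ValueError("detection_prob must be between 0 and 1")

    # Solve 1 - (1 - p)^k >= detection_prob for the chromosome count k.
    chromosomes = math.ceil(
        math.log(1 - detection_prob) / math.log(1 - allele_freq)
    )
    # Each individual contributes two chromosomes.
    return math.ceil(chromosomes / 2)


print(individuals_needed(0.05))  # allele at 5% frequency
print(individuals_needed(0.01))  # allele at 1% frequency
```

Under these assumptions, about 30 individuals are needed for a 5% allele and about 150 for a 1% allele. Seeing an allele once does not estimate its frequency, and imperfect detection would push the numbers higher.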


The work of the HGNC is supported by National Human Genome Research Institute (NHGRI) grant P41 HG03345 and Wellcome Trust grant 081979/Z/07/Z.