Sequence context at human single nucleotide polymorphisms: overrepresentation of CpG dinucleotide at polymorphic sites and suppression of variation in CpG islands. 2003

Daniel J Tomso and Douglas A Bell
Laboratory of Computational Biology and Risk Analysis, National Institute of Environmental Health Sciences, C3-03 P.O. Box 12233, Research Triangle Park, NC 27709, USA.

Human polymorphisms originate as mutations, and the influence of context on mutagenesis should be reflected in the distribution of sequences surrounding single nucleotide polymorphisms (SNPs). We have performed a computational survey of nearly two million human SNPs to determine if sequence-dependent hotspots for polymorphism exist in the human genome. Here we show that sequences containing CpG dinucleotides, which occur at low frequencies in the human genome, are 6.7-fold more abundant at polymorphic sites than expected. In contrast, polymorphisms in CpG sequences located within CpG islands, important regulatory regions that modulate gene expression, are 6.8-fold less prevalent than expected. The distribution of polymorphic alleles at CpGs in CpG islands is also significantly different from that in non-island regions. These data strongly support a role for 5-methylcytosine deamination in the generation of human variation, and suggest that variation at CpGs in islands is suppressed.
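The fold figures in the abstract compare how often CpG sites are polymorphic with how often that would be expected from their background frequency. The Python sketch below illustrates that kind of observed-versus-expected ratio on toy inputs. It is not the authors' pipeline: the function names, the input layout (5' flank, alleles, 3' flank) and the choice of "fraction of background positions lying in a CpG" as the expectation are all assumptions made for illustration.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical record layout: (5' flank, alleles at the SNP, 3' flank).
Snp = Tuple[str, Sequence[str], str]


def cpg_site_fraction(seq: str) -> float:
    """Fraction of positions in seq that are the C or the G of a CpG."""
    seq = seq.upper()
    in_cpg = [False] * len(seq)
    for i in range(len(seq) - 1):
        if seq[i] == "C" and seq[i + 1] == "G":
            in_cpg[i] = in_cpg[i + 1] = True
    return sum(in_cpg) / len(seq) if seq else 0.0


def snp_in_cpg(flank5: str, alleles: Sequence[str], flank3: str) -> bool:
    """True if any allele forms a CpG with an adjacent flanking base."""
    prev_base = flank5[-1:].upper()
    next_base = flank3[:1].upper()
    for allele in alleles:
        allele = allele.upper()
        if allele == "C" and next_base == "G":
            return True
        if allele == "G" and prev_base == "C":
            return True
    return False


def fold_enrichment(snps: Iterable[Snp], background: str) -> float:
    """Observed CpG fraction among SNPs divided by the background fraction."""
    snps = list(snps)
    if not snps:
        raise ValueError("no SNPs supplied")
    expected = cpg_site_fraction(background)
    if expected == 0.0:
        raise ValueError("background contains no CpG sites")
    observed = sum(snp_in_cpg(*s) for s in snps) / len(snps)
    return observed / expected


if __name__ == "__main__":
    # Toy data only; real input would be SNP flanks from a variation database.
    toy_snps = [
        ("AT", ("C", "T"), "GA"),
        ("AA", ("A", "G"), "TT"),
        ("TC", ("G", "A"), "AT"),
        ("AA", ("A", "T"), "AA"),
    ]
    print(fold_enrichment(toy_snps, "ACGTAAAAAA"))
```

A ratio above 1 indicates that CpG sites are overrepresented among SNPs relative to the background, and a ratio below 1 indicates depletion. Running the same calculation separately on island and non-island sequence would mirror the contrast drawn in the abstract, under the assumption that island boundaries are supplied from elsewhere.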

UI | MeSH Term | Description | Entries
D011110 | Polymorphism, Genetic | The regular and simultaneous occurrence in a single interbreeding population of two or more discontinuous genotypes. The concept includes differences in genotypes ranging in size from a single nucleotide site (POLYMORPHISM, SINGLE NUCLEOTIDE) to large nucleotide sequences visible at a chromosomal level. | Gene Polymorphism,Genetic Polymorphism,Polymorphism (Genetics),Genetic Polymorphisms,Gene Polymorphisms,Polymorphism, Gene,Polymorphisms (Genetics),Polymorphisms, Gene,Polymorphisms, Genetic
D002875 | Chromosomes | In a prokaryotic cell or in the nucleus of a eukaryotic cell, a structure consisting of or containing DNA which carries the genetic information essential to the cell. (From Singleton & Sainsbury, Dictionary of Microbiology and Molecular Biology, 2d ed) | Chromosome
D003596 | Cytosine | A pyrimidine base that is a fundamental unit of nucleic acids. |
D003641 | Deamination | The removal of an amino group (NH2) from a chemical compound. | Deaminations
D005787 | Gene Frequency | The proportion of one particular allele in the total of all ALLELES for one genetic locus in a breeding POPULATION. | Allele Frequency,Genetic Equilibrium,Equilibrium, Genetic,Allele Frequencies,Frequencies, Allele,Frequencies, Gene,Frequency, Allele,Frequency, Gene,Gene Frequencies
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000483 | Alleles | Variant forms of the same gene, occupying the same locus on homologous CHROMOSOMES, and governing the variants in production of the same gene product. | Allelomorphs,Allele,Allelomorph
D014644 | Genetic Variation | Genotypic differences observed among individuals in a population. | Genetic Diversity,Variation, Genetic,Diversity, Genetic,Diversities, Genetic,Genetic Diversities,Genetic Variations,Variations, Genetic
D015894 | Genome, Human | The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs. | Human Genome,Genomes, Human,Human Genomes
D016208 | Databases, Factual | Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references. | Databanks, Factual,Data Banks, Factual,Data Bases, Factual,Data Bank, Factual,Data Base, Factual,Databank, Factual,Database, Factual,Factual Data Bank,Factual Data Banks,Factual Data Base,Factual Data Bases,Factual Databank,Factual Databanks,Factual Database,Factual Databases
