Human-specific insertions and deletions inferred from mammalian genome sequences. 2007

Feng-Chi Chen, Chueng-Jong Chen, Wen-Hsiung Li, and Trees-Juen Chuang
Division of Biostatistics and Bioinformatics, National Health Research Institute, Miaoli County 350, Taiwan.

It has been suggested that insertions and deletions (indels) have contributed more to the sequence divergence between the human and chimpanzee genomes than nucleotide changes have (3% vs. 1.2%). However, although large indels between the two genomes have been studied, no systematic analysis of small indels (i.e., indels ≤ 100 bp) has been published. In this study, we first estimated that the false-positive rate of small indels inferred from human-chimpanzee pairwise sequence alignments is quite high, suggesting that the chimpanzee genome draft is not sufficiently accurate for our purpose. We therefore inferred only human-specific indels, using multiple sequence alignments of mammalian genomes. We identified >840,000 "small" indels, which affect >7000 UCSC-annotated human genes (>11,000 transcripts). These indels, however, amount to only approximately 0.21% sequence change in the human lineage for the regions compared, whereas in pseudogenes indels contribute a sequence divergence of 1.40%, suggesting that most of the indels that occurred in genic regions have been eliminated. Functional analysis reveals that genes whose coding exons have been affected by human-specific indels are enriched in transcription and translation regulatory activities but underrepresented in catalytic and transporter activities, cellular and physiological processes, and extracellular region/matrix. This functional bias suggests that human-specific indels might have contributed to uniquely human traits by causing changes at the RNA and protein levels.
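
The abstract does not spell out how human-specific indels were called from the multiple alignments. As a rough illustration only, the sketch below assumes a simple parsimony rule: a run of alignment columns is called a human-specific deletion when human has gaps and every other species has bases, and a human-specific insertion when human has bases and every other species has gaps. The function name `human_specific_indels`, the `max_len` parameter, and the toy sequences are hypothetical; only the 100 bp size cutoff comes from the text above.

```python
from typing import List, Optional, Tuple


def human_specific_indels(
    human: str, others: List[str], max_len: int = 100
) -> List[Tuple[str, int, int]]:
    """Call candidate human-specific indels from one aligned block.

    Assumed rule (not taken from the paper): a column counts only if the
    human gap state differs from that of EVERY other aligned species.
    Returns (kind, start_column, length) tuples, 0-based alignment columns.
    """
    if any(len(seq) != len(human) for seq in others):
        raise ValueError("all aligned sequences must have the same length")

    def column_state(i: int) -> Optional[str]:
        human_gap = human[i] == "-"
        other_gaps = [seq[i] == "-" for seq in others]
        if human_gap and not any(other_gaps):
            return "deletion"   # sequence missing only in human
        if not human_gap and all(other_gaps):
            return "insertion"  # sequence present only in human
        return None             # shared or ambiguous gap pattern

    calls: List[Tuple[str, int, int]] = []
    i, n = 0, len(human)
    while i < n:
        kind = column_state(i)
        if kind is None:
            i += 1
            continue
        # Extend the run while consecutive columns show the same pattern.
        j = i
        while j < n and column_state(j) == kind:
            j += 1
        if j - i <= max_len:    # keep only "small" indels
            calls.append((kind, i, j - i))
        i = j
    return calls


# Toy alignment (made-up sequences): human versus three other mammals.
human = "ACG--TAGGCT"
others = ["ACGTTTA--CT", "ACGTATA--CT", "ACGTTTA--CT"]
calls = human_specific_indels(human, others)
```

On this toy block the rule reports one two-column deletion and one two-column insertion. A gap that human shares with any other species is not called, which is the sense in which additional mammalian genomes can guard against errors in a single draft assembly.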

UI | MeSH Term | Description | Entries
D008322 | Mammals | Warm-blooded vertebrate animals belonging to the class Mammalia, including all that possess hair and suckle their young. | Mammalia; Mammal
D008969 | Molecular Sequence Data | Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. | Sequence Data, Molecular; Molecular Sequencing Data; Data, Molecular Sequence; Data, Molecular Sequencing; Sequencing Data, Molecular
D002679 | Pan troglodytes | The common chimpanzee, a species of the genus Pan, family HOMINIDAE. It lives in Africa, primarily in the tropical rainforests. There are a number of recognized subspecies. | Chimpanzee; Chimpanzee troglodytes; Chimpanzee troglodyte; Chimpanzees; Pan troglodyte; troglodyte, Pan; troglodytes, Chimpanzee
D002891 | Chromosomes, Human, Pair 21 | A specific pair of GROUP G CHROMOSOMES of the human chromosome classification. | Chromosome 21
D005809 | Genes, Regulator | Genes which regulate or circumscribe the activity of other genes; specifically, genes which code for PROTEINS or RNAs which have GENE EXPRESSION REGULATION functions. | Gene, Regulator; Regulator Gene; Regulator Genes; Regulatory Genes; Gene, Regulatory; Genes, Regulatory; Regulatory Gene
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000818 | Animals | Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. | Animal; Metazoa; Animalia
D001483 | Base Sequence | The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. | DNA Sequence; Nucleotide Sequence; RNA Sequence; DNA Sequences; Base Sequences; Nucleotide Sequences; RNA Sequences; Sequence, Base; Sequence, DNA; Sequence, Nucleotide; Sequence, RNA; Sequences, Base; Sequences, DNA; Sequences, Nucleotide; Sequences, RNA
D014158 | Transcription, Genetic | The biosynthesis of RNA carried out on a template of DNA. The biosynthesis of DNA from an RNA template is called REVERSE TRANSCRIPTION. | Genetic Transcription
D014176 | Protein Biosynthesis | The biosynthesis of PEPTIDES and PROTEINS on RIBOSOMES, directed by MESSENGER RNA, via TRANSFER RNA that is charged with standard proteinogenic AMINO ACIDS. | Genetic Translation; Peptide Biosynthesis, Ribosomal; Protein Translation; Translation, Genetic; Protein Biosynthesis, Ribosomal; Protein Synthesis, Ribosomal; Ribosomal Peptide Biosynthesis; mRNA Translation; Biosynthesis, Protein; Biosynthesis, Ribosomal Peptide; Biosynthesis, Ribosomal Protein; Genetic Translations; Ribosomal Protein Biosynthesis; Ribosomal Protein Synthesis; Synthesis, Ribosomal Protein; Translation, Protein; Translation, mRNA; mRNA Translations
