Genome update: the 1000th genome--a cautionary tale. 2010

Karin Lagesen, Dave W Ussery, and Trudy M Wassenaar
Centre for Molecular Biology and Neuroscience, Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, NO-0027, Oslo, Norway, and Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo, Norway.

There are now more than 1000 sequenced prokaryotic genomes deposited in public databases and available for analysis. Although the sequence databases GenBank, the DNA Data Bank of Japan (DDBJ) and EMBL are synchronized continually, their content differs slightly at the genome level for a variety of logistical reasons, including differences in format and loading errors, such as those caused by file transfer protocol interruptions. This means that the 1000th genome will be different in the various databases. Some of the data on highly accessed web pages are inaccurate, leading to false conclusions, for example about the largest bacterial genome sequenced. Biological diversity is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene families, more genes than are recognized in the human genome. Moreover, among the 1000 genomes available, not a single protein is conserved across all genomes. Excluding the members of the Archaea, only four genes are conserved in all bacteria: two protein genes and two RNA genes.
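The two counts in the abstract, the total number of gene families and the set conserved in every genome, correspond to what are commonly called the pan-genome (a union) and the core genome (an intersection). The following is a minimal sketch of that set arithmetic only, not the authors' method. The function name `pan_and_core`, the genome labels and the family identifiers are all hypothetical toy values, and it assumes that genes have already been clustered into families by some upstream step.

```python
def pan_and_core(genomes):
    """Return (pan, core) for a mapping of genome name -> set of gene-family IDs.

    pan  = families present in at least one genome (union)
    core = families present in every genome (intersection)
    """
    family_sets = [set(families) for families in genomes.values()]
    if not family_sets:
        return set(), set()
    pan = set().union(*family_sets)
    core = set.intersection(*family_sets)
    return pan, core


# Hypothetical toy data: three genomes sharing a single family.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam3", "fam4"},
    "genome_C": {"fam1", "fam5"},
}

pan, core = pan_and_core(genomes)
print(len(pan), sorted(core))  # 5 ['fam1']

# Adding one genome that lacks fam1 empties the core while the pan set grows.
genomes["genome_D"] = {"fam2", "fam6"}
pan_d, core_d = pan_and_core(genomes)
print(len(pan_d), sorted(core_d))  # 6 []
```

The second call illustrates why the conserved set shrinks as more diverse genomes are added: a single genome lacking a family removes it from the intersection, whereas the union can only grow.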

UI | MeSH Term | Description | Entries
D005798 | Genes, Bacterial | The functional hereditary units of BACTERIA. | Bacterial Gene; Bacterial Genes; Gene, Bacterial
D016680 | Genome, Bacterial | The genetic complement of a BACTERIA as represented in its DNA. | Bacterial Genome; Bacterial Genomes; Genomes, Bacterial
D017422 | Sequence Analysis, DNA | A multistage process that includes cloning, physical mapping, subcloning, determination of the DNA SEQUENCE, and information analysis. | DNA Sequence Analysis; Sequence Determination, DNA; Analysis, DNA Sequence; DNA Sequence Determination; DNA Sequence Determinations; DNA Sequencing; Determination, DNA Sequence; Determinations, DNA Sequence; Sequence Determinations, DNA; Analyses, DNA Sequence; DNA Sequence Analyses; Sequence Analyses, DNA; Sequencing, DNA
D055028 | Comparative Genomic Hybridization | A method for comparing two sets of chromosomal DNA by analyzing differences in the copy number and location of specific sequences. It is used to look for large sequence changes such as deletions, duplications, amplifications, or translocations. | Array Comparative Genomic Hybridization; Array-Based Comparative Genomic Hybridization; Comparative Genome Hybridization; Array Based Comparative Genomic Hybridization; Comparative Genome Hybridizations; Comparative Genomic Hybridizations; Genome Hybridization, Comparative; Genome Hybridizations, Comparative; Genomic Hybridization, Comparative; Genomic Hybridizations, Comparative; Hybridization, Comparative Genome; Hybridization, Comparative Genomic; Hybridizations, Comparative Genome; Hybridizations, Comparative Genomic
D020745 | Genome, Archaeal | The genetic complement of an archaeal organism (ARCHAEA) as represented in its DNA. | Archaeal Genome; Archaeal Genomes; Genomes, Archaeal
D023281 | Genomics | The systematic study of the complete DNA sequences (GENOME) of organisms. Included is construction of complete genetic, physical, and transcript maps, and the analysis of this structural genomic information on a global scale such as in GENOME WIDE ASSOCIATION STUDIES. | Functional Genomics; Structural Genomics; Comparative Genomics; Genomics, Comparative; Genomics, Functional; Genomics, Structural
D030561 | Databases, Nucleic Acid | Databases containing information about NUCLEIC ACIDS such as BASE SEQUENCE; SNPS; NUCLEIC ACID CONFORMATION; and other properties. Information about the DNA fragments kept in a GENE LIBRARY or GENOMIC LIBRARY is often maintained in DNA databases. | DDBJ; DNA Data Bank of Japan; DNA Data Banks; DNA Databases; Databases, DNA; Databases, DNA Sequence; Databases, Nucleic Acid Sequence; Databases, RNA; Databases, RNA Sequence; EMBL Nucleotide Sequence Database; GenBank; Nucleic Acid Databases; RNA Databases; DNA Databanks; DNA Sequence Databases; European Molecular Biology Laboratory Nucleotide Sequence Database; Nucleic Acid Sequence Databases; RNA Sequence Databases; Bank, DNA Data; Banks, DNA Data; DNA Data Bank; DNA Databank; DNA Database; DNA Sequence Database; Data Bank, DNA; Data Banks, DNA; Databank, DNA; Databanks, DNA; Database, DNA; Database, DNA Sequence; Database, Nucleic Acid; Database, RNA; Database, RNA Sequence; Nucleic Acid Database; RNA Database; RNA Sequence Database; Sequence Database, DNA; Sequence Database, RNA; Sequence Databases, DNA; Sequence Databases, RNA
