Recent segmental and gene duplications in the mouse genome. 2003

Joseph Cheung, Michael D Wilson, Junjun Zhang, Razi Khaja, Jeffrey R MacDonald, Henry H Q Heng, Ben F Koop, and Stephen W Scherer
Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, ON M5G 1X8, Canada. steve@genet.sickkids.on.ca

BACKGROUND The high-quality mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We used BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the Mouse Genome Sequencing Consortium (MGSC) February 2002 and February 2003 assemblies.

RESULTS We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). Of this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice.

CONCLUSIONS Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified as being involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the duplication content identified in this work. These data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.
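
The two thresholds stated in the abstract (alignments of at least 5 kb with at least 90% sequence identity) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which the abstract describes only as "BLAST-based computational heuristics". The sketch assumes the alignments are available in the standard 12-column BLAST tabular format, and the names `Hit`, `parse_hit`, `is_self_hit`, `candidate_duplications`, and `percent_of_total` are hypothetical helpers introduced here for illustration.

```python
from typing import Iterable, List, NamedTuple

# Thresholds taken from the abstract: large (>= 5 kb), recent (>= 90% identity).
MIN_LENGTH_BP = 5_000
MIN_IDENTITY_PCT = 90.0


class Hit(NamedTuple):
    """One alignment row (hypothetical container, not from the paper)."""
    query: str
    subject: str
    identity: float
    length: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int


def parse_hit(line: str) -> Hit:
    """Parse one row, assuming 12-column BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    f = line.rstrip("\n").split("\t")
    return Hit(f[0], f[1], float(f[2]), int(f[3]),
               int(f[6]), int(f[7]), int(f[8]), int(f[9]))


def is_self_hit(hit: Hit) -> bool:
    """True when a sequence is aligned to exactly the same interval of itself."""
    same_interval = (
        min(hit.q_start, hit.q_end) == min(hit.s_start, hit.s_end)
        and max(hit.q_start, hit.q_end) == max(hit.s_start, hit.s_end)
    )
    return hit.query == hit.subject and same_interval


def candidate_duplications(lines: Iterable[str]) -> List[Hit]:
    """Keep alignments that pass both the length and the identity threshold."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = parse_hit(line)
        if is_self_hit(hit):
            continue  # trivial alignment of a region to itself
        if hit.length >= MIN_LENGTH_BP and hit.identity >= MIN_IDENTITY_PCT:
            kept.append(hit)
    return kept


def percent_of_total(part_mb: float, total_mb: float) -> float:
    """Percentage of a total, e.g. duplicated Mb out of assembly Mb."""
    return 100.0 * part_mb / total_mb
```

With the figures reported above, `percent_of_total(33.6, 2695)` rounds to 1.2 and `percent_of_total(8.9, 33.6)` rounds to 26, matching the stated percentages. A real analysis would also need to merge overlapping alignments and mask common repeats, which this sketch does not attempt.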

UI | MeSH Term | Description | Entries
D005796 | Genes | A category of nucleic acid sequences that function as units of heredity and which code for the basic instructions for the development, reproduction, and maintenance of organisms. | Cistron; Gene; Genetic Materials; Cistrons; Genetic Material; Material, Genetic; Materials, Genetic
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000818 | Animals | Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. | Animal; Metazoa; Animalia
D015894 | Genome, Human | The complete genetic complement contained in the DNA of a set of CHROMOSOMES in a HUMAN. The length of the human genome is about 3 billion base pairs. | Human Genome; Genomes, Human; Human Genomes
D016678 | Genome | The genetic complement of an organism, including all of its GENES, as represented in its DNA, or in some cases, its RNA. | Genomes
D017404 | In Situ Hybridization, Fluorescence | A type of IN SITU HYBRIDIZATION in which target sequences are stained with fluorescent dye so their location and size can be determined using fluorescence microscopy. This staining is sufficiently distinct that the hybridization signal can be seen both in metaphase spreads and in interphase nuclei. | FISH Technique; Fluorescent in Situ Hybridization; Hybridization in Situ, Fluorescence; FISH Technic; Hybridization in Situ, Fluorescent; In Situ Hybridization, Fluorescent; FISH Technics; FISH Techniques; Technic, FISH; Technics, FISH; Technique, FISH; Techniques, FISH
D051379 | Mice | The common name for the genus Mus. | Mice, House; Mus; Mus musculus; Mice, Laboratory; Mouse; Mouse, House; Mouse, Laboratory; Mouse, Swiss; Mus domesticus; Mus musculus domesticus; Swiss Mice; House Mice; House Mouse; Laboratory Mice; Laboratory Mouse; Mice, Swiss; Swiss Mouse; domesticus, Mus musculus
D019143 | Evolution, Molecular | The process of cumulative change at the level of DNA; RNA; and PROTEINS, over successive generations. | Molecular Evolution; Genetic Evolution; Evolution, Genetic
D019295 | Computational Biology | A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. | Bioinformatics; Molecular Biology, Computational; Bio-Informatics; Biology, Computational; Computational Molecular Biology; Bio Informatics; Bio-Informatic; Bioinformatic; Biologies, Computational Molecular; Biology, Computational Molecular; Computational Molecular Biologies; Molecular Biologies, Computational
D020131 | Genes, Duplicate | Two identical genes showing the same phenotypic action but localized in different regions of a chromosome or on different chromosomes. (From Rieger et al., Glossary of Genetics: Classical and Molecular, 5th ed) | Duplicate Genes; Duplicate Gene; Gene, Duplicate
