Tandemly duplicated Caenorhabditis elegans collagen genes differ in their modes of splicing. 1990

Y S Park and J M Kramer
University of Illinois, Department of Biological Sciences, Chicago 60680.

Caenorhabditis elegans contains 50 to 150 collagen genes dispersed throughout its genome. We have determined the complete nucleotide sequences of two collagen genes, col-12 and col-13, that are separated by only 1800 bases and are transcribed in the same direction. The 951 nucleotides of their coding regions differ by only five nucleotides (99.5% identity). The amino acid sequences are identical except for two conservative amino acid changes within the putative secretory signal sequences, so the mature forms of the col-12 and col-13 collagens would be identical. The position and sequence of the intron (52 base pairs) within the coding region of each gene are perfectly conserved. In contrast to the coding regions and the introns, the 5' and 3' flanking regions show little sequence similarity. Both genes are expressed at similar levels at the same developmental stages and appear to utilize conserved TATA boxes and transcription start sites. The major difference between the genes is that, preceding the initiator ATG, col-12 has a cis-spliced intron, while col-13 is trans-spliced. Thus, col-12 and col-13 are essentially identical in all aspects except that the col-12 mRNA has a 26-nucleotide cis-spliced leader at the same place where the col-13 mRNA has a 22-nucleotide trans-spliced leader. These results suggest that col-12 and col-13 are derived from a gene duplication and that sequence homology in the coding regions, but not in the flanking regions, has been maintained by gene conversion. The fact that the only significant difference between the two genes is in their modes of splicing suggests that cis- and trans-splicing can be interchanged during gene evolution.
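The 99.5% identity figure quoted above follows directly from the two counts given in the abstract (951 coding nucleotides, five differences). As a minimal sketch of that arithmetic, assuming an ungapped alignment in which every difference is a single-nucleotide substitution, the calculation could look like this; the function name `percent_identity` is hypothetical and is not taken from the paper:

```python
def percent_identity(length: int, mismatches: int) -> float:
    """Return the percentage of aligned positions that are identical.

    Assumes an ungapped alignment of `length` positions in which
    `mismatches` positions differ (an assumption made for illustration).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= mismatches <= length:
        raise ValueError("mismatches must be between 0 and length")
    return 100.0 * (length - mismatches) / length


# Counts taken from the abstract: 951 coding nucleotides, 5 differences.
identity = percent_identity(951, 5)
print(f"{identity:.1f}% identity")  # 946 / 951, rounded to one decimal place
```

With these inputs, 946 of 951 positions match, which rounds to the 99.5% reported in the abstract.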

UI MeSH Term Description Entries
D007438 Introns Sequences of DNA in the genes that are located between the EXONS. They are transcribed along with the exons but are removed from the primary gene transcript by RNA SPLICING to leave mature RNA. Some introns code for separate genes. Intervening Sequences,Sequences, Intervening,Intervening Sequence,Intron,Sequence, Intervening
D008969 Molecular Sequence Data Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. Sequence Data, Molecular,Molecular Sequencing Data,Data, Molecular Sequence,Data, Molecular Sequencing,Sequencing Data, Molecular
D009693 Nucleic Acid Hybridization Widely used technique which exploits the ability of complementary sequences in single-stranded DNAs or RNAs to pair with each other to form a double helix. Hybridization can take place between two complementary DNA sequences, between a single-stranded DNA and a complementary RNA, or between two RNA sequences. The technique is used to detect and isolate specific sequences, measure homology, or define other characteristics of one or both strands. (Kendrew, Encyclopedia of Molecular Biology, 1994, p503) Genomic Hybridization,Acid Hybridization, Nucleic,Acid Hybridizations, Nucleic,Genomic Hybridizations,Hybridization, Genomic,Hybridization, Nucleic Acid,Hybridizations, Genomic,Hybridizations, Nucleic Acid,Nucleic Acid Hybridizations
D009710 Nucleotide Mapping Two-dimensional separation and analysis of nucleotides. Fingerprints, Nucleotide,Fingerprint, Nucleotide,Mapping, Nucleotide,Mappings, Nucleotide,Nucleotide Fingerprint,Nucleotide Fingerprints,Nucleotide Mappings
D002107 Caenorhabditis A genus of small free-living nematodes. Two species, CAENORHABDITIS ELEGANS and C. briggsae are much used in studies of genetics, development, aging, muscle chemistry, and neuroanatomy. Caenorhabditides
D003094 Collagen A polypeptide substance comprising about one third of the total protein in mammalian organisms. It is the main constituent of SKIN; CONNECTIVE TISSUE; and the organic substance of bones (BONE AND BONES) and teeth (TOOTH). Avicon,Avitene,Collagen Felt,Collagen Fleece,Collagenfleece,Collastat,Dermodress,Microfibril Collagen Hemostat,Pangen,Zyderm,alpha-Collagen,Collagen Hemostat, Microfibril,alpha Collagen
D004247 DNA A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine). DNA, Double-Stranded,Deoxyribonucleic Acid,ds-DNA,DNA, Double Stranded,Double-Stranded DNA,ds DNA
D005796 Genes A category of nucleic acid sequences that function as units of heredity and which code for the basic instructions for the development, reproduction, and maintenance of organisms. Cistron,Gene,Genetic Materials,Cistrons,Genetic Material,Material, Genetic,Materials, Genetic
D005810 Multigene Family A set of genes descended by duplication and variation from some ancestral gene. Such genes may be clustered together on the same chromosome or dispersed on different chromosomes. Examples of multigene families include those that encode the hemoglobins, immunoglobulins, histocompatibility antigens, actins, tubulins, keratins, collagens, heat shock proteins, salivary glue proteins, chorion proteins, cuticle proteins, yolk proteins, and phaseolins, as well as histones, ribosomal RNA, and transfer RNA genes. The latter three are examples of reiterated genes, where hundreds of identical genes are present in a tandem array. (King & Stanfield, A Dictionary of Genetics, 4th ed) Gene Clusters,Genes, Reiterated,Cluster, Gene,Clusters, Gene,Families, Multigene,Family, Multigene,Gene Cluster,Gene, Reiterated,Multigene Families,Reiterated Gene,Reiterated Genes
D000595 Amino Acid Sequence The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
