Structural organization of the 5' region of the thyroglobulin gene. Evidence for intron loss and "exonization" during evolution. 1987

J Parma, D Christophe, V Pohl, and G Vassart
Institut de Recherche Interdisciplinaire, Faculté de Médecine, Université Libre de Bruxelles.

More than one third of thyroglobulin (1190 residues out of 2750) is made of one peptide motif repeated ten times in tandem. Segments unrelated to the motif interrupt this structure at various places. The corresponding gene region, which extends over 40 × 10³ bases, was studied in detail. All exon borders and exon/intron junctions were localized precisely and sequenced, and their positions were correlated with the repetitive organization of the protein. When intron positions were compiled on a consensus sequence of all repeats, three categories of introns were observed. Except between repeats nos. 5 and 6, an intron was invariably found within the Cys codon marking the limit of each motif. This category of intron most probably reflects the serial duplication events responsible for the evolution of this region of the gene. All other introns, except no. 2, are found at positions where the repetitive structure is disrupted by "inserted" peptides. We present the hypothesis that this second category of introns was already present in the original unit before the first duplication. Thereafter, these introns would have experienced either complete loss (some units do not contain any intron) or partial or total exonization, resulting in the slipping of intronic material into coding sequence. Finally, intron no. 2 splits motif no. 1 at the boundary between two segments presenting sequence homology. This last type of intron probably reflects an initial duplication event at the origin of a primordial thyroglobulin gene motif. With all these characteristics, the thyroglobulin gene is presented as a paradigm for the analysis of the fate of introns in gene evolution.

UI MeSH Term Description Entries
D007438 Introns Sequences of DNA in the genes that are located between the EXONS. They are transcribed along with the exons but are removed from the primary gene transcript by RNA SPLICING to leave mature RNA. Some introns code for separate genes. Intervening Sequences,Sequences, Intervening,Intervening Sequence,Intron,Sequence, Intervening
D008969 Molecular Sequence Data Descriptions of specific amino acid, carbohydrate, or nucleotide sequences which have appeared in the published literature and/or are deposited in and maintained by databanks such as GENBANK, European Molecular Biology Laboratory (EMBL), National Biomedical Research Foundation (NBRF), or other sequence repositories. Sequence Data, Molecular,Molecular Sequencing Data,Data, Molecular Sequence,Data, Molecular Sequencing,Sequencing Data, Molecular
D012091 Repetitive Sequences, Nucleic Acid Sequences of DNA or RNA that occur in multiple copies. There are several types: INTERSPERSED REPETITIVE SEQUENCES are copies of transposable elements (DNA TRANSPOSABLE ELEMENTS or RETROELEMENTS) dispersed throughout the genome. TERMINAL REPEAT SEQUENCES flank both ends of another sequence, for example, the long terminal repeats (LTRs) on RETROVIRUSES. Variations may be direct repeats, those occurring in the same direction, or inverted repeats, those opposite to each other in direction. TANDEM REPEAT SEQUENCES are copies which lie adjacent to each other, direct or inverted (INVERTED REPEAT SEQUENCES). DNA Repetitious Region,Direct Repeat,Genes, Selfish,Nucleic Acid Repetitive Sequences,Repetitive Region,Selfish DNA,Selfish Genes,DNA, Selfish,Repetitious Region, DNA,Repetitive Sequence,DNA Repetitious Regions,DNAs, Selfish,Direct Repeats,Gene, Selfish,Repeat, Direct,Repeats, Direct,Repetitious Regions, DNA,Repetitive Regions,Repetitive Sequences,Selfish DNAs,Selfish Gene
D002417 Cattle Domesticated bovine animals of the genus Bos, usually kept on a farm or ranch and used for the production of meat or dairy products or for heavy labor. Beef Cow,Bos grunniens,Bos indicus,Bos indicus Cattle,Bos taurus,Cow,Cow, Domestic,Dairy Cow,Holstein Cow,Indicine Cattle,Taurine Cattle,Taurus Cattle,Yak,Zebu,Beef Cows,Bos indicus Cattles,Cattle, Bos indicus,Cattle, Indicine,Cattle, Taurine,Cattle, Taurus,Cattles, Bos indicus,Cattles, Indicine,Cattles, Taurine,Cattles, Taurus,Cow, Beef,Cow, Dairy,Cow, Holstein,Cows,Dairy Cows,Domestic Cow,Domestic Cows,Indicine Cattles,Taurine Cattles,Taurus Cattles,Yaks,Zebus
D002874 Chromosome Mapping Any method used for determining the location of and relative distances between genes on a chromosome. Gene Mapping,Linkage Mapping,Genome Mapping,Chromosome Mappings,Gene Mappings,Genome Mappings,Linkage Mappings,Mapping, Chromosome,Mapping, Gene,Mapping, Genome,Mapping, Linkage,Mappings, Chromosome,Mappings, Gene,Mappings, Genome,Mappings, Linkage
D004247 DNA A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine). DNA, Double-Stranded,Deoxyribonucleic Acid,ds-DNA,DNA, Double Stranded,Double-Stranded DNA,ds DNA
D005091 Exons The parts of a transcript of a split GENE remaining after the INTRONS are removed. They are spliced together to become a MESSENGER RNA or other functional RNA. Mini-Exon,Exon,Mini Exon,Mini-Exons
D005796 Genes A category of nucleic acid sequences that function as units of heredity and which code for the basic instructions for the development, reproduction, and maintenance of organisms. Cistron,Gene,Genetic Materials,Cistrons,Genetic Material,Material, Genetic,Materials, Genetic
D006801 Humans Members of the species Homo sapiens. Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D000595 Amino Acid Sequence The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
