Identification of protein-coding sequences using the hybridization of 18S rRNA and mRNA during translation. 2009

Chuanhua Xing, Donald L Bitzer, Winser E Alexander, Mladen A Vouk, and Anne-Marie Stomp
Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, USA. cx6@duke.edu

We introduce a new approach to distinguishing protein-coding sequences from non-coding sequences, using a period-3 free energy signal that arises from the interactions of the 3'-terminal nucleotides of the 18S rRNA with mRNA. We extracted features of the amplitude and phase of the period-3 signal that occur in protein-coding regions but are not found in non-coding regions, and used them to distinguish protein-coding sequences from non-coding sequences. We tested the method on all the experimental genes from Saccharomyces cerevisiae and Schizosaccharomyces pombe. The identification was consistent with the corresponding information from GenBank, and performed better than existing methods that use a period-3 signal. Primary tests on some fly, mouse and human genes suggest that our method is applicable to higher eukaryotic genes. Tests on pseudogenes indicated that most pseudogenes have no period-3 signal. Some exploration of the 3'-tail of 18S rRNA and pattern analysis of protein-coding sequences further supported our assumption that the 3'-tail of 18S rRNA plays a synchronization role throughout the translation elongation process. This, in turn, can be utilized for the identification of protein-coding sequences.
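
The abstract does not spell out how the amplitude and phase of the period-3 signal are obtained. As a minimal sketch, assuming a per-nucleotide free energy signal has already been computed, one common way to isolate a period-3 component is to project the signal onto a complex exponential of period 3. The function name `period3_component` and the toy signals are hypothetical illustrations, not values or code from the paper, and the authors' actual procedure may differ.

```python
import cmath
import math


def period3_component(signal):
    """Return (amplitude, phase) of the period-3 Fourier component.

    `signal` is assumed to be a list of per-nucleotide values, for example
    free energy estimates. This is a plain Fourier projection, used here
    only as an illustration.
    """
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    mean = sum(signal) / n
    # Remove the mean, then project onto exp(-2*pi*i*k/3), i.e. period 3.
    coeff = sum(
        (x - mean) * cmath.exp(-2j * math.pi * k / 3)
        for k, x in enumerate(signal)
    ) / n
    return abs(coeff), cmath.phase(coeff)


# Toy signals (made up): a dip every third position, in two different frames.
frame0 = [-2.0, 0.0, 0.0] * 10
frame1 = [0.0, -2.0, 0.0] * 10
flat = [1.0] * 30  # no periodicity at all

amp0, phase0 = period3_component(frame0)
amp1, phase1 = period3_component(frame1)
amp_flat, _ = period3_component(flat)
```

In this toy setting the two shifted signals have the same amplitude but different phases, while the flat signal has essentially zero amplitude. That is the intuition behind using amplitude to flag a periodic region and phase to track its frame.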

UI | MeSH Term | Description | Entries
D010441 | Peptide Chain Elongation, Translational | A process of GENETIC TRANSLATION, when an amino acid is transferred from its cognate TRANSFER RNA to the lengthening chain of PEPTIDES. | Chain Elongation, Peptide, Translational; Protein Biosynthesis Elongation; Protein Chain Elongation, Translational; Protein Translation Elongation; Translation Elongation, Genetic; Translation Elongation, Protein; Translational Elongation, Protein; Translational Peptide Chain Elongation; Biosynthesis Elongation, Protein; Elongation, Genetic Translation; Elongation, Protein Biosynthesis; Elongation, Protein Translation; Elongation, Protein Translational; Genetic Translation Elongation; Protein Translational Elongation
D012333 | RNA, Messenger | RNA sequences that serve as templates for protein synthesis. Bacterial mRNAs are generally primary transcripts in that they do not require post-transcriptional processing. Eukaryotic mRNA is synthesized in the nucleus and must be exported to the cytoplasm for translation. Most eukaryotic mRNAs have a sequence of polyadenylic acid at the 3' end, referred to as the poly(A) tail. The function of this tail is not known for certain, but it may play a role in the export of mature mRNA from the nucleus as well as in helping stabilize some mRNA molecules by retarding their degradation in the cytoplasm. | Messenger RNA; Messenger RNA, Polyadenylated; Poly(A) Tail; Poly(A)+ RNA; Poly(A)+ mRNA; RNA, Messenger, Polyadenylated; RNA, Polyadenylated; mRNA; mRNA, Non-Polyadenylated; mRNA, Polyadenylated; Non-Polyadenylated mRNA; Poly(A) RNA; Polyadenylated mRNA; Non Polyadenylated mRNA; Polyadenylated Messenger RNA; Polyadenylated RNA; RNA, Polyadenylated Messenger; mRNA, Non Polyadenylated
D012337 | RNA, Ribosomal, 18S | Constituent of the 40S subunit of eukaryotic ribosomes. 18S rRNA is involved in the initiation of polypeptide synthesis in eukaryotes. | 18S Ribosomal RNA; 18S RRNA; RNA, 18S Ribosomal; Ribosomal RNA, 18S
D016366 | Open Reading Frames | A sequence of successive nucleotide triplets that are read as CODONS specifying AMINO ACIDS and begin with an INITIATOR CODON and end with a stop codon (CODON, TERMINATOR). | ORFs; Protein Coding Region; Small Open Reading Frame; Small Open Reading Frames; sORF; Unassigned Reading Frame; Unassigned Reading Frames; Unidentified Reading Frame; Coding Region, Protein; Frame, Unidentified Reading; ORF; Open Reading Frame; Protein Coding Regions; Reading Frame, Open; Reading Frame, Unassigned; Reading Frame, Unidentified; Region, Protein Coding; Unidentified Reading Frames
D017423 | Sequence Analysis, RNA | A multistage process that includes cloning, physical mapping, subcloning, sequencing, and information analysis of an RNA SEQUENCE. | RNA Sequence Analysis; Sequence Determination, RNA; Analysis, RNA Sequence; Determination, RNA Sequence; Determinations, RNA Sequence; RNA Sequence Determination; RNA Sequence Determinations; RNA Sequencing; Sequence Determinations, RNA; Analyses, RNA Sequence; RNA Sequence Analyses; Sequence Analyses, RNA; Sequencing, RNA
D019295 | Computational Biology | A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. | Bioinformatics; Molecular Biology, Computational; Bio-Informatics; Biology, Computational; Computational Molecular Biology; Bio Informatics; Bio-Informatic; Bioinformatic; Biologies, Computational Molecular; Biology, Computational Molecular; Computational Molecular Biologies; Molecular Biologies, Computational
D020029 | Base Pairing | Pairing of purine and pyrimidine bases by HYDROGEN BONDING in double-stranded DNA or RNA. | Base Pair; Base Pairs; Base Pairings
D029701 | Saccharomyces cerevisiae Proteins | Proteins obtained from the species SACCHAROMYCES CEREVISIAE. The function of specific proteins from this organism are the subject of intense scientific interest and have been used to derive basic understanding of the functioning similar proteins in higher eukaryotes. | Baker's Yeast Proteins; S cerevisiae Proteins
D029702 | Schizosaccharomyces pombe Proteins | Proteins obtained from the species Schizosaccharomyces pombe. The function of specific proteins from this organism are the subject of intense scientific interest and have been used to derive basic understanding of the functioning similar proteins in higher eukaryotes. | Fission Yeast Proteins; S pombe Proteins
