A comparative genomic method for computational identification of prokaryotic translation initiation sites. 2002

Megon Walker, Vladimir Pavlovic, and Simon Kasif
Bioinformatics Program, Boston University, Boston, MA 02215, USA.

The ever-growing number of completely sequenced prokaryotic genomes facilitates cross-species comparisons by genomic annotation algorithms. This paper introduces a new probabilistic framework for comparative genomic analysis and demonstrates its utility in improving the accuracy of prokaryotic gene start site detection. Our framework employs a product hidden Markov model (PROD-HMM) whose state architecture models the species-specific trinucleotide frequency patterns in sequences immediately upstream and downstream of a translation start site and detects the contrasting non-synonymous (amino acid changing) and synonymous (silent) substitution rates that differentiate prokaryotic coding from intergenic regions. Depending on the intricacy of the features modeled by the hidden state architecture, intergenic, regulatory, promoter and coding regions can be delimited by this method. The new system is evaluated using a preliminary set of orthologous Pyrococcus gene pairs, for which it demonstrates an improved accuracy of detection. Its robustness is confirmed by cross-validation analysis of an experimentally verified set of Escherichia coli K-12 and Salmonella typhimurium LT2 orthologs. The novel architecture has a number of attractive features that distinguish it from previous comparative models such as pair-HMMs.
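
The contrast between synonymous and non-synonymous substitutions mentioned above can be illustrated with a minimal Python sketch. It is not the PROD-HMM itself and does not reproduce the authors' implementation. It only counts, codon by codon, how two in-frame, gap-free aligned sequences differ. The function name `count_substitutions`, the toy sequences, and the use of the standard genetic code are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in TCAG order. Assumption: the standard table is adequate for this toy case.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(seq_a, seq_b):
    """Count differing codon pairs that are synonymous vs non-synonymous.

    Hypothetical helper: both sequences must be in-frame, gap-free,
    equal-length DNA strings (an aligned ortholog pair).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    synonymous = non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons carry no substitution signal
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1  # silent change: same amino acid
        else:
            non_synonymous += 1  # amino acid changing
    return synonymous, non_synonymous


# Toy pair: GCT -> GCC is silent (Ala), AAA -> AGA changes Lys to Arg.
print(count_substitutions("ATGGCTAAA", "ATGGCCAGA"))
```

In a coding region one would expect silent changes to be over-represented relative to amino acid changing ones, whereas intergenic sequence shows no such codon-position bias; a comparative model can exploit that difference as a signal.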

UI MeSH Term Description Entries
D011387 Prokaryotic Cells Cells lacking a nuclear membrane so that the nuclear material is either scattered in the cytoplasm or collected in a nucleoid region. Cell, Prokaryotic,Cells, Prokaryotic,Prokaryotic Cell
D004926 Escherichia coli A species of gram-negative, facultatively anaerobic, rod-shaped bacteria (GRAM-NEGATIVE FACULTATIVELY ANAEROBIC RODS) commonly found in the lower part of the intestine of warm-blooded animals. It is usually nonpathogenic, but some strains are known to produce DIARRHEA and pyogenic infections. Pathogenic strains (virotypes) are classified by their specific pathogenic mechanisms such as toxins (ENTEROTOXIGENIC ESCHERICHIA COLI), etc. Alkalescens-Dispar Group,Bacillus coli,Bacterium coli,Bacterium coli commune,Diffusely Adherent Escherichia coli,E coli,EAggEC,Enteroaggregative Escherichia coli,Enterococcus coli,Diffusely Adherent E. coli,Enteroaggregative E. coli,Enteroinvasive E. coli,Enteroinvasive Escherichia coli
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D001426 Bacterial Proteins Proteins found in any species of bacterium. Bacterial Gene Products,Bacterial Gene Proteins,Gene Products, Bacterial,Bacterial Gene Product,Bacterial Gene Protein,Bacterial Protein,Gene Product, Bacterial,Gene Protein, Bacterial,Gene Proteins, Bacterial,Protein, Bacterial,Proteins, Bacterial
D001483 Base Sequence The sequence of PURINES and PYRIMIDINES in nucleic acids and polynucleotides. It is also called nucleotide sequence. DNA Sequence,Nucleotide Sequence,RNA Sequence,DNA Sequences,Base Sequences,Nucleotide Sequences,RNA Sequences,Sequence, Base,Sequence, DNA,Sequence, Nucleotide,Sequence, RNA,Sequences, Base,Sequences, DNA,Sequences, Nucleotide,Sequences, RNA
D012486 Salmonella typhimurium A serotype of Salmonella enterica that is a frequent agent of Salmonella gastroenteritis in humans. It also causes PARATYPHOID FEVER. Salmonella typhimurium LT2
D014176 Protein Biosynthesis The biosynthesis of PEPTIDES and PROTEINS on RIBOSOMES, directed by MESSENGER RNA, via TRANSFER RNA that is charged with standard proteinogenic AMINO ACIDS. Genetic Translation,Peptide Biosynthesis, Ribosomal,Protein Translation,Translation, Genetic,Protein Biosynthesis, Ribosomal,Protein Synthesis, Ribosomal,Ribosomal Peptide Biosynthesis,mRNA Translation,Biosynthesis, Protein,Biosynthesis, Ribosomal Peptide,Biosynthesis, Ribosomal Protein,Genetic Translations,Ribosomal Protein Biosynthesis,Ribosomal Protein Synthesis,Synthesis, Ribosomal Protein,Translation, Protein,Translation, mRNA,mRNA Translations
D015203 Reproducibility of Results The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results. Reliability and Validity,Reliability of Result,Reproducibility Of Result,Reproducibility of Finding,Validity of Result,Validity of Results,Face Validity,Reliability (Epidemiology),Reliability of Results,Reproducibility of Findings,Test-Retest Reliability,Validity (Epidemiology),Finding Reproducibilities,Finding Reproducibility,Of Result, Reproducibility,Of Results, Reproducibility,Reliabilities, Test-Retest,Reliability, Test-Retest,Result Reliabilities,Result Reliability,Result Validities,Result Validity,Result, Reproducibility Of,Results, Reproducibility Of,Test Retest Reliability,Validity and Reliability,Validity, Face
D016680 Genome, Bacterial The genetic complement of a BACTERIA as represented in its DNA. Bacterial Genome,Bacterial Genomes,Genomes, Bacterial
D018387 Codon, Initiator A codon that directs initiation of protein translation (TRANSLATION, GENETIC) by stimulating the binding of initiator tRNA (RNA, TRANSFER, MET). In prokaryotes, the codons AUG or GUG can act as initiators while in eukaryotes, AUG is the only initiator codon. Codon, Start,Initiator Codon,Codon, Initiation,Start Codon,Codons, Initiation,Codons, Initiator,Codons, Start,Initiation Codon,Initiation Codons,Initiator Codons,Start Codons
