Reduced mRNA secondary-structure stability near the start codon indicates functional genes in prokaryotes. 2012

Thomas E Keller, S David Mis, Kevin E Jia, and Claus O Wilke
Section of Integrative Biology, The University of Texas at Austin, TX, USA.

Several recent studies have found that selection acts on synonymous mutations at the beginning of genes to reduce mRNA secondary-structure stability, presumably to aid in translation initiation. This observation suggests that a metric of relative mRNA secondary-structure stability, Z(ΔG), could be used to test whether putative genes are likely to be functionally important. Using the Escherichia coli genome, we compared the mean Z(ΔG) of genes with known functions, genes with known orthologs, genes where function and orthology are unknown, and pseudogenes. Genes in the first two categories demonstrated similar levels of selection for reduced stability (increased Z(ΔG)), whereas for pseudogenes stability did not differ from our null expectation. Surprisingly, genes where function and orthology were unknown were also not different from the null expectation, suggesting that many of these open reading frames are not functionally important. We extended our analysis by constructing a Bayesian phylogenetic mixed model based on data from 145 prokaryotic genomes. As in E. coli, genes with no known function had consistently lower Z(ΔG), even though we expect that many of the currently unannotated genes will ultimately have their functional utility discovered. Our findings suggest that functional genes tend to evolve increased Z(ΔG), whereas nonfunctional ones do not. Therefore, Z(ΔG) may be a useful metric for identifying genes of potentially important function and could be used to target genes for further functional study.
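
The abstract does not spell out how Z(ΔG) is computed. A minimal sketch, assuming it is a standard z-score of a gene's observed folding free energy against a null sample of folding energies (for example, from synonymously permuted versions of the same region), might look like the following. The function name `z_delta_g` and the numbers are hypothetical and chosen only for illustration. The folding energies themselves would come from an RNA folding program, which is not shown here.

```python
from statistics import mean, stdev


def z_delta_g(observed_dg, null_dgs):
    """Z-score of an observed folding energy against a null sample.

    Assumption (not stated in the abstract): Z = (observed - null mean) / null SD.
    Because a less negative folding energy means a less stable structure,
    a positive Z corresponds to reduced stability relative to the null.
    """
    if len(null_dgs) < 2:
        raise ValueError("need at least two null values")
    sd = stdev(null_dgs)  # sample standard deviation of the null energies
    if sd == 0:
        raise ValueError("null folding energies have zero variance")
    return (observed_dg - mean(null_dgs)) / sd


# Hypothetical folding energies in kcal/mol (made-up numbers, not study data).
null_sample = [-8.0, -7.0, -6.0, -5.0, -4.0]
print(z_delta_g(-3.0, null_sample))  # positive: less stable than the null
print(z_delta_g(-6.0, null_sample))  # zero: indistinguishable from the null mean
```

Under this convention, a category of genes under selection for reduced stability would show a mean Z(ΔG) above zero, while a category matching the null expectation would average near zero.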

UI | MeSH Term | Description | Entries
D009690 | Nucleic Acid Conformation | The spatial arrangement of the atoms of a nucleic acid or polynucleotide that results in its characteristic 3-dimensional shape. | DNA Conformation; RNA Conformation; Conformation, DNA; Conformation, Nucleic Acid; Conformation, RNA; Conformations, DNA; Conformations, Nucleic Acid; Conformations, RNA; DNA Conformations; Nucleic Acid Conformations; RNA Conformations
D011387 | Prokaryotic Cells | Cells lacking a nuclear membrane so that the nuclear material is either scattered in the cytoplasm or collected in a nucleoid region. | Cell, Prokaryotic; Cells, Prokaryotic; Prokaryotic Cell
D011544 | Pseudogenes | Genes bearing close resemblance to known genes at different loci, but rendered non-functional by additions or deletions in structure that prevent normal transcription or translation. When lacking introns and containing a poly-A segment near the downstream end (as a result of reverse copying from processed nuclear RNA into double-stranded DNA), they are called processed genes. | Genes, Processed; beta-Tubulin Pseudogene; Gene, Processed; Processed Gene; Processed Genes; Pseudogene; Pseudogene, beta-Tubulin; Pseudogenes, beta-Tubulin; beta Tubulin Pseudogene; beta-Tubulin Pseudogenes
D004926 | Escherichia coli | A species of gram-negative, facultatively anaerobic, rod-shaped bacteria (GRAM-NEGATIVE FACULTATIVELY ANAEROBIC RODS) commonly found in the lower part of the intestine of warm-blooded animals. It is usually nonpathogenic, but some strains are known to produce DIARRHEA and pyogenic infections. Pathogenic strains (virotypes) are classified by their specific pathogenic mechanisms such as toxins (ENTEROTOXIGENIC ESCHERICHIA COLI), etc. | Alkalescens-Dispar Group; Bacillus coli; Bacterium coli; Bacterium coli commune; Diffusely Adherent Escherichia coli; E coli; EAggEC; Enteroaggregative Escherichia coli; Enterococcus coli; Diffusely Adherent E. coli; Enteroaggregative E. coli; Enteroinvasive E. coli; Enteroinvasive Escherichia coli
D005798 | Genes, Bacterial | The functional hereditary units of BACTERIA. | Bacterial Gene; Bacterial Genes; Gene, Bacterial
D001105 | Archaea | One of the three domains of life (the others being BACTERIA and Eukarya), formerly called Archaebacteria under the taxon Bacteria, but now considered separate and distinct. They are characterized by: (1) the presence of characteristic tRNAs and ribosomal RNAs; (2) the absence of peptidoglycan cell walls; (3) the presence of ether-linked lipids built from branched-chain subunits; and (4) their occurrence in unusual habitats. While archaea resemble bacteria in morphology and genomic organization, they resemble eukarya in their method of genomic replication. The domain contains at least four kingdoms: CRENARCHAEOTA; EURYARCHAEOTA; NANOARCHAEOTA; and KORARCHAEOTA. | Archaebacteria; Archaeobacteria; Archaeon; Archebacteria
D012333 | RNA, Messenger | RNA sequences that serve as templates for protein synthesis. Bacterial mRNAs are generally primary transcripts in that they do not require post-transcriptional processing. Eukaryotic mRNA is synthesized in the nucleus and must be exported to the cytoplasm for translation. Most eukaryotic mRNAs have a sequence of polyadenylic acid at the 3' end, referred to as the poly(A) tail. The function of this tail is not known for certain, but it may play a role in the export of mature mRNA from the nucleus as well as in helping stabilize some mRNA molecules by retarding their degradation in the cytoplasm. | Messenger RNA; Messenger RNA, Polyadenylated; Poly(A) Tail; Poly(A)+ RNA; Poly(A)+ mRNA; RNA, Messenger, Polyadenylated; RNA, Polyadenylated; mRNA; mRNA, Non-Polyadenylated; mRNA, Polyadenylated; Non-Polyadenylated mRNA; Poly(A) RNA; Polyadenylated mRNA; Non Polyadenylated mRNA; Polyadenylated Messenger RNA; Polyadenylated RNA; RNA, Polyadenylated Messenger; mRNA, Non Polyadenylated
D012372 | ROC Curve | A graphic means for assessing the ability of a screening test to discriminate between healthy and diseased persons; may also be used in other studies, e.g., distinguishing stimuli responses as to a faint stimuli or nonstimuli. | ROC Analysis; Receiver Operating Characteristic; Analysis, ROC; Analyses, ROC; Characteristic, Receiver Operating; Characteristics, Receiver Operating; Curve, ROC; Curves, ROC; ROC Analyses; ROC Curves; Receiver Operating Characteristics
D015964 | Gene Expression Regulation, Bacterial | Any of the processes by which cytoplasmic or intercellular factors influence the differential control of gene action in bacteria. | Bacterial Gene Expression Regulation; Regulation of Gene Expression, Bacterial; Regulation, Gene Expression, Bacterial
D016015 | Logistic Models | Statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable. A common application is in epidemiology for estimating an individual's risk (probability of a disease) as a function of a given risk factor. | Logistic Regression; Logit Models; Models, Logistic; Logistic Model; Logistic Regressions; Logit Model; Model, Logistic; Model, Logit; Models, Logit; Regression, Logistic; Regressions, Logistic
