Standard Codon Substitution Models Overestimate Purifying Selection for Nonstationary Data. 2017

Benjamin D Kaehler, Von Bing Yap, and Gavin A Huttley
Research School of Biology, College of Medicine, Biology, and Environment, Australian National University, Canberra, ACT, Australia.

Estimation of natural selection on protein-coding sequences is a key comparative genomics approach for de novo prediction of lineage-specific adaptations. Selective pressure is measured on a per-gene basis by comparing the rate of nonsynonymous substitutions to the rate of synonymous substitutions. All published codon substitution models have been time-reversible and thus assume that sequence composition does not change over time. We previously demonstrated that if time-reversible DNA substitution models are applied in the presence of changing sequence composition, the number of substitutions is systematically biased towards overestimation. We extend these findings to the case of codon substitution models and further demonstrate that the ratio of nonsynonymous to synonymous rates of substitution tends to be underestimated over three data sets of mammals, vertebrates, and insects. Our basis for comparison is a nonstationary codon substitution model that allows sequence composition to change. Goodness-of-fit results demonstrate that our new model tends to fit the data better. Direct measurement of nonstationarity shows that bias in estimates of natural selection and genetic distance increases with the degree of violation of the stationarity assumption. Additionally, inferences drawn under time-reversible models are systematically affected by compositional divergence. As genomic sequences accumulate at an accelerating rate, the importance of accurate de novo estimation of natural selection increases. Our results establish that our new model provides a more robust perspective on this fundamental quantity.
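The nonsynonymous/synonymous comparison described in the abstract depends on classifying each codon change by whether it alters the encoded amino acid. The following is a minimal sketch of that classification only, assuming the standard genetic code; it is not the authors' model and does not estimate any substitution rate. The names `GENETIC_CODE`, `classify`, and `count_single_step_changes` are hypothetical and chosen for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(codon_a, codon_b):
    """Classify a change between two codons that differ at exactly one site."""
    differences = sum(x != y for x, y in zip(codon_a, codon_b))
    if differences != 1:
        raise ValueError("codons must differ at exactly one position")
    aa_a, aa_b = GENETIC_CODE[codon_a], GENETIC_CODE[codon_b]
    if "*" in (aa_a, aa_b):
        return "nonsense"  # change to or from a stop codon
    return "synonymous" if aa_a == aa_b else "nonsynonymous"


def count_single_step_changes():
    """Count every one-nucleotide change that starts from a sense codon."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "nonsense": 0}
    for codon, aa in GENETIC_CODE.items():
        if aa == "*":
            continue  # only changes starting from sense codons are counted
        for position, base in product(range(3), BASES):
            if base == codon[position]:
                continue
            neighbor = codon[:position] + base + codon[position + 1:]
            counts[classify(codon, neighbor)] += 1
    return counts


if __name__ == "__main__":
    print(classify("CTT", "CTC"))  # both encode leucine
    print(classify("ATG", "ATA"))  # methionine versus isoleucine
    print(count_single_step_changes())
```

Far more one-step changes are nonsynonymous than synonymous, which is one reason codon models compare rates rather than raw counts of observed differences.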

UI | MeSH Term | Description | Entries
D008390 | Markov Chains | A stochastic process such that the conditional probability distribution for a state at any future instant, given the present state, is unaffected by any additional knowledge of the past history of the system. | Markov Process; Markov Chain; Chain, Markov; Chains, Markov; Markov Processes; Process, Markov; Processes, Markov
D008957 | Models, Genetic | Theoretical representations that simulate the behavior or activity of genetic processes or phenomena. They include the use of mathematical equations, computers, and other electronic equipment. | Genetic Models; Genetic Model; Model, Genetic
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein; Gene Proteins; Protein; Protein Gene Products; Proteins, Gene
D003062 | Codon | A set of three nucleotides in a protein coding sequence that specifies individual amino acids or a termination signal (CODON, TERMINATOR). Most codons are universal, but some organisms do not produce the transfer RNAs (RNA, TRANSFER) complementary to all codons. These codons are referred to as unassigned codons (CODONS, NONSENSE). | Codon, Sense; Sense Codon; Codons; Codons, Sense; Sense Codons
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000818 | Animals | Unicellular or multicellular, heterotrophic organisms, that have sensation and the power of voluntary movement. Under the older five kingdom paradigm, Animalia was one of the kingdoms. Under the modern three domain model, Animalia represents one of the many groups in the domain EUKARYOTA. | Animal; Metazoa; Animalia
D012641 | Selection, Genetic | Differential and non-random reproduction of different genotypes, operating to alter the gene frequencies within a population. | Natural Selection; Genetic Selection; Selection, Natural
