Bayesian segmental models with multiple sequence alignment profiles for protein secondary structure and contact map prediction. 2006

Wei Chu, Zoubin Ghahramani, Alexei Podtelezhnikov, and David L Wild
Gatsby Computational Neuroscience Unit, University College London, London, UK. chuwei@gatsby.ucl.ac.uk

In this paper, we develop a segmental semi-Markov model (SSMM) for protein secondary structure prediction that incorporates multiple sequence alignment profiles to improve predictive performance. The segmental model is a generalization of the hidden Markov model in which a hidden state generates a whole segment of varying length and secondary structure type, rather than a single residue. We propose a novel parameterized model for the likelihood function that explicitly represents multiple sequence alignment profiles to capture the segmental conformation. Numerical results on benchmark data sets show that incorporating the profiles yields substantial improvements and that the generalization performance is promising. By incorporating information from long-range interactions in beta-sheets, the model is also capable of carrying out inference on contact maps. This is an important advantage of probabilistic generative models over the traditional discriminative approach to protein secondary structure prediction. A Web server for our algorithm and supplementary materials are available at http://public.kgi.edu/-wild/bsm.html.
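The idea of a hidden state emitting a whole segment, rather than one residue, can be illustrated with a small Viterbi-style decoder. The following is only a sketch under simplifying assumptions, not the authors' model: positions within a segment are scored independently, the score arrays (`emit`, `log_trans`, `log_len`, `log_init`) are hypothetical inputs that would in practice be derived from alignment profiles and training data, and the long-range beta-sheet interactions mentioned in the abstract are not modelled.

```python
import numpy as np

def segmental_viterbi(emit, log_trans, log_len, log_init):
    """Most probable segmentation under a toy segmental semi-Markov model.

    emit[t, k]      : log-likelihood of position t under segment type k
    log_trans[j, k] : log-probability that a type-k segment follows a type-j one
    log_len[k, d-1] : log-probability that a type-k segment has length d
    log_init[k]     : log-probability that the first segment has type k
    Returns a list of (start, end_exclusive, type) tuples.
    """
    n, n_types = emit.shape
    max_len = log_len.shape[1]
    # Prefix sums give the emission score of any segment in O(1).
    csum = np.vstack([np.zeros(n_types), np.cumsum(emit, axis=0)])
    # best[t, k]: best score of positions [0, t) whose last segment has type k.
    best = np.full((n + 1, n_types), -np.inf)
    back = {}
    for t in range(1, n + 1):
        for k in range(n_types):
            for d in range(1, min(max_len, t) + 1):
                s = t - d  # candidate start of the last segment
                seg = csum[t, k] - csum[s, k] + log_len[k, d - 1]
                if s == 0:
                    score, prev = log_init[k] + seg, None
                else:
                    cand = best[s] + log_trans[:, k]
                    prev = int(np.argmax(cand))
                    score = cand[prev] + seg
                if score > best[t, k]:
                    best[t, k] = score
                    back[(t, k)] = (s, prev)
    # Trace back from the best final segment type.
    t, k = n, int(np.argmax(best[n]))
    segments = []
    while t > 0:
        s, prev = back[(t, k)]
        segments.append((s, t, k))
        t, k = s, prev
    return segments[::-1]

# Toy input (illustrative numbers only): six positions, two segment types.
# The first three positions favour type 0, the last three favour type 1.
emit = np.log(np.array([[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3))
# Adjacent segments must differ in type, so self-transitions are forbidden.
log_trans = np.array([[-np.inf, 0.0], [0.0, -np.inf]])
log_len = np.log(np.full((2, 4), 0.25))  # uniform over lengths 1..4
log_init = np.log(np.array([0.5, 0.5]))

demo_segments = segmental_viterbi(emit, log_trans, log_len, log_init)
print(demo_segments)
```

The explicit length term `log_len` is what distinguishes this recursion from ordinary HMM decoding: the loop over `d` considers every admissible length for the final segment, at a cost that grows linearly with the maximum segment length.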

UI | MeSH Term | Description | Entries
D008390 | Markov Chains | A stochastic process such that the conditional probability distribution for a state at any future instant, given the present state, is unaffected by any additional knowledge of the past history of the system. | Markov Process; Markov Chain; Chain, Markov; Chains, Markov; Markov Processes; Process, Markov; Processes, Markov
D008958 | Models, Molecular | Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures. | Molecular Models; Model, Molecular; Molecular Model
D009419 | Nerve Tissue Proteins | | Proteins, Nerve Tissue; Tissue Proteins, Nerve
D004546 | Elapid Venoms | Venoms from snakes of the family Elapidae, including cobras, kraits, mambas, coral, tiger, and Australian snakes. The venoms contain polypeptide toxins of various kinds, cytolytic, hemolytic, and neurotoxic factors, but fewer enzymes than viper or crotalid venoms. Many of the toxins have been characterized. | Cobra Venoms; Elapidae Venom; Elapidae Venoms; Naja Venoms; Cobra Venom; Elapid Venom; Hydrophid Venom; Hydrophid Venoms; King Cobra Venom; Naja Venom; Ophiophagus hannah Venom; Sea Snake Venom; Sea Snake Venoms; Venom, Cobra; Venom, Elapid; Venom, Elapidae; Venom, Hydrophid; Venom, King Cobra; Venom, Naja; Venom, Ophiophagus hannah; Venom, Sea Snake; Venoms, Cobra; Venoms, Elapid; Venoms, Elapidae; Venoms, Hydrophid; Venoms, Naja; Venoms, Sea Snake
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D001499 | Bayes Theorem | A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result. | Bayesian Analysis; Bayesian Estimation; Bayesian Forecast; Bayesian Method; Bayesian Prediction; Analysis, Bayesian; Bayesian Approach; Approach, Bayesian; Approachs, Bayesian; Bayesian Approachs; Estimation, Bayesian; Forecast, Bayesian; Method, Bayesian; Prediction, Bayesian; Theorem, Bayes
D012372 | ROC Curve | A graphic means for assessing the ability of a screening test to discriminate between healthy and diseased persons; may also be used in other studies, e.g., distinguishing stimuli responses as to a faint stimuli or nonstimuli. | ROC Analysis; Receiver Operating Characteristic; Analysis, ROC; Analyses, ROC; Characteristic, Receiver Operating; Characteristics, Receiver Operating; Curve, ROC; Curves, ROC; ROC Analyses; ROC Curves; Receiver Operating Characteristics
D015203 | Reproducibility of Results | The statistical reproducibility of measurements (often in a clinical context), including the testing of instrumentation or techniques to obtain reproducible results. The concept includes reproducibility of physiological measurements, which may be used to develop rules to assess probability or prognosis, or response to a stimulus; reproducibility of occurrence of a condition; and reproducibility of experimental results. | Reliability and Validity; Reliability of Result; Reproducibility Of Result; Reproducibility of Finding; Validity of Result; Validity of Results; Face Validity; Reliability (Epidemiology); Reliability of Results; Reproducibility of Findings; Test-Retest Reliability; Validity (Epidemiology); Finding Reproducibilities; Finding Reproducibility; Of Result, Reproducibility; Of Results, Reproducibility; Reliabilities, Test-Retest; Reliability, Test-Retest; Result Reliabilities; Result Reliability; Result Validities; Result Validity; Result, Reproducibility Of; Results, Reproducibility Of; Test Retest Reliability; Validity and Reliability; Validity, Face
D015233 | Models, Statistical | Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc. | Probabilistic Models; Statistical Models; Two-Parameter Models; Model, Statistical; Models, Binomial; Models, Polynomial; Statistical Model; Binomial Model; Binomial Models; Model, Binomial; Model, Polynomial; Model, Probabilistic; Model, Two-Parameter; Models, Probabilistic; Models, Two-Parameter; Polynomial Model; Polynomial Models; Probabilistic Model; Two Parameter Models; Two-Parameter Model
D016013 | Likelihood Functions | Functions constructed from a statistical model and a set of observed data which give the probability of that data for various values of the unknown model parameters. Those parameter values that maximize the probability are the maximum likelihood estimates of the parameters. | Likelihood Ratio Test; Maximum Likelihood Estimates; Estimate, Maximum Likelihood; Estimates, Maximum Likelihood; Function, Likelihood; Functions, Likelihood; Likelihood Function; Maximum Likelihood Estimate; Test, Likelihood Ratio
