SAAFEC-SEQ: A Sequence-Based Method for Predicting the Effect of Single Point Mutations on Protein Thermodynamic Stability. 2021

Gen Li, Shailesh Kumar Panday, and Emil Alexov
Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA.

Modeling the effect of mutations on protein thermodynamic stability is useful for protein engineering and for understanding the molecular mechanisms of disease-causing variants. Here, we report a new development of the SAAFEC method, SAAFEC-SEQ, a gradient boosting decision tree machine learning method that predicts the change in folding free energy caused by amino acid substitutions. The method requires only the sequence of the protein, not its 3D structure, and can therefore be applied to genome-scale investigations where structural information is very sparse. SAAFEC-SEQ uses physicochemical properties, sequence features, and evolutionary information to make its predictions. Benchmarked on several independent datasets, it is shown to consistently outperform all existing state-of-the-art sequence-based methods in both Pearson correlation coefficient and root-mean-squared error. SAAFEC-SEQ has been implemented as a web server and is also available as stand-alone code that can be downloaded and embedded into other researchers' code.
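The two benchmark metrics named in the abstract, the Pearson correlation coefficient and the root-mean-squared error, can be computed as in the minimal sketch below. This is an illustration only and not the SAAFEC-SEQ code: the function names and the free energy change values are made up for the example, and the real benchmarks use the datasets described in the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-squared error between two equal-length sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical folding free energy changes in kcal/mol (illustrative only).
experimental = [-1.2, 0.3, -2.5, 0.8, -0.4]
predicted = [-0.9, 0.1, -1.8, 0.5, -0.7]

print(f"PCC  = {pearson(experimental, predicted):.3f}")
print(f"RMSE = {rmse(experimental, predicted):.3f} kcal/mol")
```

A higher Pearson coefficient means the predictions rank and scale with the experimental values more faithfully, while a lower RMSE means smaller absolute errors; reporting both guards against a method that correlates well but is systematically offset.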

UI | MeSH Term | Description | Entries
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein; Gene Proteins; Protein; Protein Gene Products; Proteins, Gene
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000069550 | Machine Learning | A type of ARTIFICIAL INTELLIGENCE that enable COMPUTERS to independently initiate and execute LEARNING when exposed to new data. | Transfer Learning; Learning, Machine; Learning, Transfer
D012984 | Software | Sequential operating programs and data which instruct the functioning of a digital computer. | Computer Programs; Computer Software; Open Source Software; Software Engineering; Software Tools; Computer Applications Software; Computer Programs and Programming; Computer Software Applications; Application, Computer Software; Applications Software, Computer; Applications Softwares, Computer; Applications, Computer Software; Computer Applications Softwares; Computer Program; Computer Software Application; Engineering, Software; Open Source Softwares; Program, Computer; Programs, Computer; Software Application, Computer; Software Applications, Computer; Software Tool; Software, Computer; Software, Computer Applications; Software, Open Source; Softwares, Computer Applications; Softwares, Open Source; Source Software, Open; Source Softwares, Open; Tool, Software; Tools, Software
D013816 | Thermodynamics | A rigorously mathematical analysis of energy relationships (heat, work, temperature, and equilibrium). It describes systems whose states are determined by thermal parameters, such as temperature, in addition to mechanical and electromagnetic parameters. (From Hawley's Condensed Chemical Dictionary, 12th ed) | Thermodynamic
D017354 | Point Mutation | A mutation caused by the substitution of one nucleotide for another. This results in the DNA molecule having a change in a single base pair. | Mutation, Point; Mutations, Point; Point Mutations
D055550 | Protein Stability | The ability of a protein to retain its structural conformation or its activity when subjected to physical or chemical manipulations. | Protein Stabilities; Stabilities, Protein; Stability, Protein
D019943 | Amino Acid Substitution | The naturally occurring or experimentally induced replacement of one or more AMINO ACIDS in a protein with another. If a functionally equivalent amino acid is substituted, the protein may retain wild-type activity. Substitution may also diminish, enhance, or eliminate protein function. Experimentally induced substitution is often used to study enzyme activities and binding site properties. | Amino Acid Substitutions; Substitution, Amino Acid; Substitutions, Amino Acid
