SAAMBE-SEQ: a sequence-based method for predicting mutation effect on protein-protein binding affinity. 2021

Gen Li, Swagata Pahari, Adithya Krishna Murthy, Siqi Liang, Robert Fragoza, Haiyuan Yu, and Emil Alexov
Department of Physics and Astronomy, Clemson University, Clemson, SC 29634, USA.

The vast majority of human genetic disorders are associated with mutations that affect protein-protein interactions by altering wild-type binding affinity. It is therefore extremely important to assess the effect of mutations on protein-protein binding free energy to assist the development of therapeutic solutions. Currently, the most popular approaches rely on structural information to make their predictions, which precludes their application to genome-scale investigations. Indeed, with the progress of genomic sequencing, researchers frequently need to assess the effect of mutations for which no structure is available. Here, we report SAAMBE-SEQ, a Gradient Boosting Decision Tree machine learning algorithm that is completely sequence-based and does not require any structural information. SAAMBE-SEQ uses 80 features representing evolutionary information, sequence-based features and the change of physical properties upon mutation at the mutation site. The approach achieves a Pearson correlation coefficient (PCC) of 0.83 in 5-fold cross-validation in a benchmarking test against experimentally determined binding free energy changes (ΔΔG). Further, a blind test set (no-STRUC) was compiled by collecting experimental ΔΔG values upon mutation for protein complexes for which no structure is available; benchmarking SAAMBE-SEQ on this set resulted in a PCC in the range of 0.37-0.46. The accuracy of SAAMBE-SEQ is found to be either better than or comparable to that of the most advanced structure-based methods. SAAMBE-SEQ is very fast, is available as a webserver and as stand-alone code, and uses only sequence information; it is thus applicable to genome-scale investigations of the effect of mutations on protein-protein interactions. SAAMBE-SEQ is available at http://compbio.clemson.edu/saambe_webserver/indexSEQ.php#started. Supplementary data are available at Bioinformatics online.
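
The evaluation protocol named in the abstract (PCC between predicted and experimental ΔΔG under 5-fold cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the data are synthetic, and a one-feature least-squares line stands in for the Gradient Boosting Decision Tree model, whose 80 features and training details are not given here. The sketch pools held-out predictions before computing a single PCC; the paper may instead report a per-fold average.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Placeholder model (NOT a GBDT): one-feature least-squares line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_validated_pcc(features, ddg, k=5, seed=0):
    """Predict each sample with a model trained on the other folds, then score PCC."""
    predictions = [0.0] * len(ddg)
    for fold in k_fold_indices(len(ddg), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(ddg)) if i not in held_out]
        slope, intercept = fit_line([features[i] for i in train],
                                    [ddg[i] for i in train])
        for i in fold:
            predictions[i] = slope * features[i] + intercept
    return pearson(predictions, ddg)


# Synthetic data, invented for illustration only (not from the paper).
rng = random.Random(1)
toy_feature = [rng.uniform(-2.0, 2.0) for _ in range(100)]
toy_ddg = [0.8 * x + rng.gauss(0.0, 0.5) for x in toy_feature]
print(round(cross_validated_pcc(toy_feature, toy_ddg), 2))
```

Because every sample is predicted only by a model that never saw it during fitting, the resulting PCC estimates performance on unseen mutations rather than goodness of fit on the training data.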

UI | MeSH Term | Description | Entries
D009154 | Mutation | Any detectable and heritable change in the genetic material that causes a change in the GENOTYPE and which is transmitted to daughter cells and to succeeding generations. | Mutations
D011485 | Protein Binding | The process in which substances, either endogenous or exogenous, bind to proteins, peptides, enzymes, protein precursors, or allied compounds. Specific protein-binding measures are often used as assays in diagnostic assessments. | Plasma Protein Binding Capacity; Binding, Protein
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein; Gene Proteins; Protein; Protein Gene Products; Proteins, Gene
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D012984 | Software | Sequential operating programs and data which instruct the functioning of a digital computer. | Computer Programs; Computer Software; Open Source Software; Software Engineering; Software Tools; Computer Applications Software; Computer Programs and Programming; Computer Software Applications; Application, Computer Software; Applications Software, Computer; Applications Softwares, Computer; Applications, Computer Software; Computer Applications Softwares; Computer Program; Computer Software Application; Engineering, Software; Open Source Softwares; Program, Computer; Programs, Computer; Software Application, Computer; Software Applications, Computer; Software Tool; Software, Computer; Software, Computer Applications; Software, Open Source; Softwares, Computer Applications; Softwares, Open Source; Source Software, Open; Source Softwares, Open; Tool, Software; Tools, Software
