SumSec: Accurate Prediction of Sumoylation Sites Using Predicted Secondary Structure. 2018

Abdollah Dehzangi, Yosvany López, Ghazaleh Taherzadeh, Alok Sharma, and Tatsuhiko Tsunoda
Department of Computer Science, Morgan State University, Baltimore, MD 21251, USA. abdollah.dehzangi@morgan.edu.

Post-translational modification (PTM) is the modification of amino acids along a protein sequence after translation. These modifications significantly affect protein function, so a comprehensive understanding of the mechanisms underlying PTMs is critical for studying the biological roles of proteins. Among the wide range of PTMs, sumoylation is one of the most important because of its known cellular functions, which include transcriptional regulation, protein stability, and protein subcellular localization. Despite its importance, determining sumoylation sites experimentally is time-consuming and costly. This has created strong demand for fast computational methods that can accurately determine sumoylation sites in proteins. In this study, we present SumSec, a new machine learning-based method for predicting sumoylation sites. We use the predicted secondary structure of amino acids to extract two types of structural features from neighboring amino acids along the protein sequence; these features have not previously been used for this task. As a result, the proposed method improves sumoylation site prediction and outperforms previously proposed methods in the literature. SumSec achieved high sensitivity (0.91), accuracy (0.94), and MCC (0.88). The prediction accuracy achieved in this study is 21% better than that reported in previous studies. The script and extracted features are publicly available at: https://github.com/YosvanyLopez/SumSec.
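The three reported metrics follow the standard definitions for binary classification. The sketch below shows how sensitivity, accuracy, and the Matthews correlation coefficient (MCC) are computed from confusion-matrix counts. The function name `binary_metrics` and the example counts are illustrative assumptions; they are not taken from the SumSec repository or from the study's data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, accuracy, mcc) from confusion-matrix counts.

    tp/fn: sumoylation sites predicted correctly / missed.
    tn/fp: non-sumoylation sites predicted correctly / wrongly flagged.
    """
    total = tp + fp + tn + fn
    # Sensitivity: fraction of true sites that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all predictions that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, accuracy, mcc


# Hypothetical counts, chosen only to illustrate the calculation.
sens, acc, mcc = binary_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.2f} accuracy={acc:.3f} mcc={mcc:.2f}")
```

MCC is commonly reported for tasks like this because modified sites are usually far rarer than unmodified ones, and MCC stays informative when the classes are imbalanced.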

UI | MeSH Term | Description | Entries
D008958 | Models, Molecular | Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures. | Molecular Models,Model, Molecular,Molecular Model
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D000069550 | Machine Learning | A type of ARTIFICIAL INTELLIGENCE that enable COMPUTERS to independently initiate and execute LEARNING when exposed to new data. | Transfer Learning,Learning, Machine,Learning, Transfer
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
D001665 | Binding Sites | The parts of a macromolecule that directly participate in its specific combination with another molecule. | Combining Site,Binding Site,Combining Sites,Site, Binding,Site, Combining,Sites, Binding,Sites, Combining
D017433 | Protein Structure, Secondary | The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to ALPHA-HELICES; BETA-STRANDS (which align to form BETA-SHEETS), or other types of coils. This is the first folding level of protein conformation. | Secondary Protein Structure,Protein Structures, Secondary,Secondary Protein Structures,Structure, Secondary Protein,Structures, Secondary Protein
D058207 | Sumoylation | A type of POST-TRANSLATIONAL PROTEIN MODIFICATION by SMALL UBIQUITIN-RELATED MODIFIER PROTEINS (also known as SUMO proteins). | SUMO-Conjugation,SUMO Conjugation,SUMO-Conjugations,Sumoylations
D019295 | Computational Biology | A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. | Bioinformatics,Molecular Biology, Computational,Bio-Informatics,Biology, Computational,Computational Molecular Biology,Bio Informatics,Bio-Informatic,Bioinformatic,Biologies, Computational Molecular,Biology, Computational Molecular,Computational Molecular Biologies,Molecular Biologies, Computational
D020407 | Internet | A loose confederation of computer communication networks around the world. The networks that make up the Internet are connected through several backbone networks. The Internet grew out of the US Government ARPAnet project and was designed to facilitate information exchange. | World Wide Web,Cyber Space,Cyberspace,Web, World Wide,Wide Web, World
