An Integrated-OFFT Model for the Prediction of Protein Secondary Structure Class. 2019

Bishnupriya Panda, Babita Majhi, and Abhimanyu Thakur
Department of Computer Science and Engineering, Institute of Technical Education and Research, Siksha 'O' Anusandhan University, Bhubaneswar, Orissa, India.

BACKGROUND Proteins are highly versatile macromolecules that play crucial roles in many biological processes. The amino acid sequence has long been used to predict protein secondary structure. However, the major methods for predicting protein secondary structure class have not yet considered the impact of Gaussian noise on the sequence representation of amino acids, which is one of the important constraints on the functionality of a protein.

METHODS In the present study, protein secondary structure class was predicted by the integrated application of the Stockwell transform and Amino Acid Composition (AAC) to the equivalent Electron-Ion Interaction Potential (EIIP) representation of the raw amino acid sequence. The method was evaluated on four benchmark datasets of low sequence homology: PDB25, 498, 277, and 204. A random forest classifier with the out-of-bag error estimate and a Support Vector Machine (SVM) with k-fold cross-validation demonstrated the high feature representation potential of the reported approach.

RESULTS The overall prediction accuracies for the PDB25, 498, 277, and 204 datasets were 92.5%, 94.79%, 92.45%, and 88.04%, respectively, with the random forest classifier, and 84.66%, 95.32%, 89.29%, and 84.37%, respectively, with SVM.

CONCLUSIONS An integrated order-function-frequency-time (OFFT) model has been proposed for the prediction of protein secondary structure class. For the first time, we report the effect of Gaussian noise on the prediction accuracy of protein secondary structure class and propose a robust integrated-OFFT model that is effectively noise resistant.
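The feature pipeline named in METHODS (EIIP encoding, Amino Acid Composition, Stockwell transform) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the EIIP table, the transform parameters, or how the features are combined. The EIIP values below are the commonly tabulated ones and should be checked against a primary source; the function names `eiip_encode`, `aac`, `dft`, and `stockwell`, and the choice to skip non-standard residues, are assumptions made for illustration.

```python
import cmath
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Commonly tabulated EIIP values (assumption: not taken from the abstract).
EIIP = {
    "A": 0.0373, "C": 0.0829, "D": 0.1263, "E": 0.0058, "F": 0.0946,
    "G": 0.0050, "H": 0.0242, "I": 0.0000, "K": 0.0371, "L": 0.0000,
    "M": 0.0823, "N": 0.0036, "P": 0.0198, "Q": 0.0761, "R": 0.0959,
    "S": 0.0829, "T": 0.0941, "V": 0.0057, "W": 0.0548, "Y": 0.0516,
}

def eiip_encode(seq):
    """Map a sequence to its numeric EIIP signal, skipping unknown residues."""
    return [EIIP[r] for r in seq.upper() if r in EIIP]

def aac(seq):
    """Amino Acid Composition: fraction of each of the 20 standard residues."""
    residues = [r for r in seq.upper() if r in EIIP]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    return [counts[a] / len(residues) for a in AMINO_ACIDS]

def dft(x):
    """Naive discrete Fourier transform (fine for short illustrative signals)."""
    n_pts = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * math.pi * m * k / n_pts) for k in range(n_pts))
        for m in range(n_pts)
    ]

def stockwell(x):
    """Discrete Stockwell transform; rows are frequencies 0..N//2, columns are positions."""
    n_pts = len(x)
    if n_pts == 0:
        return []
    spectrum = dft(x)
    rows = [[sum(x) / n_pts] * n_pts]  # zero-frequency row is the signal mean
    for n in range(1, n_pts // 2 + 1):
        # Frequency-domain Gaussian window, wrapped around for negative offsets.
        window = [
            math.exp(-2 * math.pi ** 2 * min(m, n_pts - m) ** 2 / n ** 2)
            for m in range(n_pts)
        ]
        rows.append([
            sum(
                spectrum[(m + n) % n_pts] * window[m]
                * cmath.exp(2j * math.pi * m * j / n_pts)
                for m in range(n_pts)
            ) / n_pts
            for j in range(n_pts)
        ])
    return rows
```

For a sequence `seq`, `aac(seq)` gives a fixed-length 20-element vector, while `stockwell(eiip_encode(seq))` gives a complex frequency-by-position matrix whose size depends on the sequence length, so it would need to be summarized (for example, by statistics of its magnitudes) before being passed to a classifier. That summarization step is left out here because the abstract does not specify it.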

UI | MeSH Term | Description | Entries
D008956 | Models, Chemical | Theoretical representations that simulate the behavior or activity of chemical processes or phenomena; includes the use of mathematical equations, computers, and other electronic equipment. | Chemical Models, Chemical Model, Model, Chemical
D000465 | Algorithms | A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. | Algorithm
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary, Amino Acid Sequences, Sequence, Amino Acid, Sequences, Amino Acid, Primary Protein Structure, Primary Protein Structures, Protein Structures, Primary, Structure, Primary Protein, Structures, Primary Protein
D017433 | Protein Structure, Secondary | The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to ALPHA-HELICES; BETA-STRANDS (which align to form BETA-SHEETS), or other types of coils. This is the first folding level of protein conformation. | Secondary Protein Structure, Protein Structures, Secondary, Secondary Protein Structures, Structure, Secondary Protein, Structures, Secondary Protein
D019295 | Computational Biology | A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. | Bioinformatics, Molecular Biology, Computational, Bio-Informatics, Biology, Computational, Computational Molecular Biology, Bio Informatics, Bio-Informatic, Bioinformatic, Biologies, Computational Molecular, Biology, Computational Molecular, Computational Molecular Biologies, Molecular Biologies, Computational
D030562 | Databases, Protein | Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties. | Amino Acid Sequence Databases, Databases, Amino Acid Sequence, Protein Databases, Protein Sequence Databases, SWISS-PROT, Protein Structure Databases, SwissProt, Database, Protein, Database, Protein Sequence, Database, Protein Structure, Databases, Protein Sequence, Databases, Protein Structure, Protein Database, Protein Sequence Database, Protein Structure Database, SWISS PROT, Sequence Database, Protein, Sequence Databases, Protein, Structure Database, Protein, Structure Databases, Protein
