Accurate prediction of protein structural classes using functional domains and predicted secondary structure sequences. 2012

Amin Ahmadi Adl, Abbas Nowzari-Dalini, Bin Xue, Vladimir N Uversky, and Xiaoning Qian
Department of Computer Science & Engineering, University of South Florida, Tampa, FL 33620, USA.

Protein structural class prediction is one of the challenging problems in bioinformatics. Previous methods based directly on the similarity of amino acid (AA) sequences have been shown to be insufficient for low-similarity protein data-sets. To improve the prediction accuracy for such low-similarity proteins, several methods have recently been proposed that explore novel feature sets based on predicted secondary structure propensities. In this paper, we focus on protein structural class prediction using combinations of these novel features, including secondary structure propensities as well as functional domain (FD) features extracted from the InterPro signature database. Our comprehensive experimental results on several benchmark data-sets show that integrating the new FD features substantially improves the accuracy of structural class prediction for low-similarity proteins, as these features capture meaningful relationships among AA residues that are far apart in the protein sequence. The proposed method has also been tested on predicting structural classes for partially disordered proteins, with reasonable prediction accuracy. This is a more difficult problem than structural class prediction on the commonly used benchmark data-sets and, to the best of our knowledge, has not been attempted before. In addition, to avoid overfitting with a large number of features, feature selection is applied to select discriminating features that contribute to achieving high prediction accuracy. The selected features have been shown to achieve stable prediction performance across different benchmark data-sets.
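The feature combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-state secondary-structure alphabet (H, E, C), the use of simple state fractions as "propensity" features, the binary encoding of functional domains, and the names `ss_composition`, `fd_indicators`, `build_features`, and the `FD_A`-style identifiers are all assumptions made for illustration. The abstract does not specify the actual feature definitions.

```python
from collections import Counter

# Assumed three-state alphabet: H = helix, E = strand, C = coil.
SS_STATES = ("H", "E", "C")


def ss_composition(ss_seq):
    """Return the fraction of residues in each predicted secondary-structure state."""
    if not ss_seq:
        raise ValueError("empty secondary-structure sequence")
    counts = Counter(ss_seq)
    return [counts.get(state, 0) / len(ss_seq) for state in SS_STATES]


def fd_indicators(protein_domains, domain_vocab):
    """Return a binary vector: 1.0 where the protein matches a domain signature."""
    hits = set(protein_domains)
    return [1.0 if domain in hits else 0.0 for domain in domain_vocab]


def build_features(ss_seq, protein_domains, domain_vocab):
    """Concatenate secondary-structure fractions with functional-domain indicators."""
    return ss_composition(ss_seq) + fd_indicators(protein_domains, domain_vocab)


# Hypothetical domain vocabulary and protein; the identifiers are placeholders,
# not real InterPro accessions.
vocab = ["FD_A", "FD_B", "FD_C"]
features = build_features("HHHHCCEEEC", ["FD_B"], vocab)
print(features)
```

A vector built this way could then be passed to any standard classifier, with feature selection applied beforehand to limit overfitting when the domain vocabulary is large.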

UI MeSH Term Description Entries
D011506 Proteins Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D000465 Algorithms A procedure consisting of a sequence of algebraic formulas and/or logical steps to calculate or determine a given task. Algorithm
D016415 Sequence Alignment The arrangement of two or more amino acid or base sequences from an organism or organisms in such a way as to align areas of the sequences sharing common properties. The degree of relatedness or homology between the sequences is predicted computationally or statistically based on weights assigned to the elements aligned between the sequences. This in turn can serve as a potential indicator of the genetic relatedness between the organisms. Sequence Homology Determination,Determination, Sequence Homology,Alignment, Sequence,Alignments, Sequence,Determinations, Sequence Homology,Sequence Alignments,Sequence Homology Determinations
D017433 Protein Structure, Secondary The level of protein structure in which regular hydrogen-bond interactions within contiguous stretches of polypeptide chain give rise to ALPHA-HELICES; BETA-STRANDS (which align to form BETA-SHEETS), or other types of coils. This is the first folding level of protein conformation. Secondary Protein Structure,Protein Structures, Secondary,Secondary Protein Structures,Structure, Secondary Protein,Structures, Secondary Protein
D020539 Sequence Analysis, Protein A process that includes the determination of AMINO ACID SEQUENCE of a protein (or peptide, oligopeptide or peptide fragment) and the information analysis of the sequence. Amino Acid Sequence Analysis,Peptide Sequence Analysis,Protein Sequence Analysis,Sequence Determination, Protein,Amino Acid Sequence Analyses,Amino Acid Sequence Determination,Amino Acid Sequence Determinations,Amino Acid Sequencing,Peptide Sequence Determination,Protein Sequencing,Sequence Analyses, Amino Acid,Sequence Analysis, Amino Acid,Sequence Analysis, Peptide,Sequence Determination, Amino Acid,Sequence Determinations, Amino Acid,Acid Sequencing, Amino,Analyses, Peptide Sequence,Analyses, Protein Sequence,Analysis, Peptide Sequence,Analysis, Protein Sequence,Peptide Sequence Analyses,Peptide Sequence Determinations,Protein Sequence Analyses,Protein Sequence Determination,Protein Sequence Determinations,Sequence Analyses, Peptide,Sequence Analyses, Protein,Sequence Determination, Peptide,Sequence Determinations, Peptide,Sequence Determinations, Protein,Sequencing, Amino Acid,Sequencing, Protein
D030562 Databases, Protein Databases containing information about PROTEINS such as AMINO ACID SEQUENCE; PROTEIN CONFORMATION; and other properties. Amino Acid Sequence Databases,Databases, Amino Acid Sequence,Protein Databases,Protein Sequence Databases,SWISS-PROT,Protein Structure Databases,SwissProt,Database, Protein,Database, Protein Sequence,Database, Protein Structure,Databases, Protein Sequence,Databases, Protein Structure,Protein Database,Protein Sequence Database,Protein Structure Database,SWISS PROT,Sequence Database, Protein,Sequence Databases, Protein,Structure Database, Protein,Structure Databases, Protein
D040901 Proteomics The systematic study of the complete complement of proteins (PROTEOME) of organisms. Peptidomics
