DeepGOPlus: improved protein function prediction from sequence. 2020

Maxat Kulmanov and Robert Hoehndorf
Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia.

Protein function prediction is one of the major tasks of bioinformatics and can help with a wide range of biological problems, such as understanding disease mechanisms or finding drug targets. Many methods are available for predicting protein functions from sequence-based features, protein-protein interaction networks, protein structure or literature. However, other than sequence, most of these features are difficult to obtain or not available for many proteins, which limits their scope. Furthermore, the performance of sequence-based function prediction methods is often lower than that of methods that incorporate multiple features, and predicting protein functions may require a lot of time. We developed a novel method for predicting protein functions from sequence alone which combines a deep convolutional neural network (CNN) model with sequence similarity-based predictions. Our CNN model scans the sequence for motifs that are predictive of protein functions and combines this with the functions of similar proteins (if available). We evaluate the performance of DeepGOPlus using the CAFA3 evaluation measures and achieve an Fmax of 0.390, 0.557 and 0.614 for the BPO, MFO and CCO evaluations, respectively. These results would have made DeepGOPlus one of the three best predictors in CCO and the second best performing method in the BPO and MFO evaluations. We also compare DeepGOPlus with state-of-the-art methods such as DeepText2GO and GOLabeler on another dataset. DeepGOPlus can annotate around 40 protein sequences per second on common hardware, thereby making fast and accurate function predictions available for a wide range of proteins. Availability: http://deepgoplus.bio2vec.net/. Supplementary data are available at Bioinformatics online.
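
The Fmax values quoted in the abstract are protein-centric scores: for each score threshold, precision is averaged over the proteins that receive at least one prediction, recall is averaged over all benchmark proteins, and Fmax is the best harmonic mean found over the thresholds. The sketch below illustrates that calculation only; it is not the authors' evaluation code. The function name `fmax`, the dictionary layout, the 0.01 threshold step and the toy GO identifiers are assumptions made for illustration, and the propagation of annotations to ancestor terms in the ontology is omitted.

```python
from typing import Dict, Set


def fmax(truth: Dict[str, Set[str]],
         scores: Dict[str, Dict[str, float]],
         steps: int = 100) -> float:
    """Protein-centric Fmax over thresholds 1/steps, 2/steps, ..., 1.

    truth  maps protein id -> set of true terms (assumed non-empty).
    scores maps protein id -> {term: predicted score in [0, 1]}.
    """
    best = 0.0
    for i in range(1, steps + 1):
        t = i / steps
        precisions = []      # one entry per protein with >= 1 prediction
        recall_sum = 0.0     # accumulated over all benchmark proteins
        for protein, true_terms in truth.items():
            predicted = {term for term, s in scores.get(protein, {}).items()
                         if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # no protein has a prediction at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy data with made-up identifiers, purely for illustration.
truth = {"P1": {"GO:a", "GO:b"}, "P2": {"GO:c"}}
scores = {
    "P1": {"GO:a": 0.9, "GO:b": 0.4, "GO:x": 0.3},
    "P2": {"GO:c": 0.2, "GO:y": 0.6},
}
print(round(fmax(truth, scores), 3))  # 0.737
```

In the toy data the best trade-off occurs at the lowest thresholds, where every true term is recovered (recall 1.0) at an average precision of 7/12, giving 14/19, or about 0.737.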

UI | MeSH Term | Description | Entries
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D000595 | Amino Acid Sequence | The order of amino acids as they occur in a polypeptide chain. This is referred to as the primary structure of proteins. It is of fundamental importance in determining PROTEIN CONFORMATION. | Protein Structure, Primary,Amino Acid Sequences,Sequence, Amino Acid,Sequences, Amino Acid,Primary Protein Structure,Primary Protein Structures,Protein Structures, Primary,Structure, Primary Protein,Structures, Primary Protein
D016571 | Neural Networks, Computer | A computer architecture, implementable in either hardware or software, modeled after biological neural networks. Like the biological system in which the processing capability is a result of the interconnection strengths between arrays of nonlinear processing nodes, computerized neural networks, often called perceptrons or multilayer connectionist models, consist of neuron-like units. A homogeneous group of units makes up a layer. These networks are good at pattern recognition. They are adaptive, performing tasks by example, and thus are better for decision-making than are linear learning machines or cluster analysis. They do not require explicit programming. | Computational Neural Networks,Connectionist Models,Models, Neural Network,Neural Network Models,Neural Networks (Computer),Perceptrons,Computational Neural Network,Computer Neural Network,Computer Neural Networks,Connectionist Model,Model, Connectionist,Model, Neural Network,Models, Connectionist,Network Model, Neural,Network Models, Neural,Network, Computational Neural,Network, Computer Neural,Network, Neural (Computer),Networks, Computational Neural,Networks, Computer Neural,Networks, Neural (Computer),Neural Network (Computer),Neural Network Model,Neural Network, Computational,Neural Network, Computer,Neural Networks, Computational,Perceptron
D060066 | Protein Interaction Maps | Graphs representing sets of measurable, non-covalent physical contacts with specific PROTEINS in living organisms or in cells. | Protein-Protein Interaction Map,Protein-Protein Interaction Network,Protein Interaction Networks,Interaction Map, Protein,Interaction Map, Protein-Protein,Interaction Network, Protein,Interaction Network, Protein-Protein,Map, Protein Interaction,Map, Protein-Protein Interaction,Network, Protein Interaction,Network, Protein-Protein Interaction,Protein Interaction Map,Protein Interaction Network,Protein Protein Interaction Map,Protein Protein Interaction Network,Protein-Protein Interaction Maps,Protein-Protein Interaction Networks
