Sequence-based prediction of DNA-binding residues in proteins with conservation and correlation information. 2012

Xin Ma, Jing Guo, Hong-De Liu, Jian-Ming Xie, and Xiao Sun
State Key Laboratory of Bioelectronics, School of Biological Science & Medical Engineering, Southeast University and Nanjing Audit University, Nanjing, P.R. China. maxin@seu.edu.cn

The recognition of DNA-binding residues in proteins is critical to our understanding of the mechanisms of DNA-protein interactions and gene expression, and for guiding drug design. Therefore, a prediction method, DNABR (DNA Binding Residues), is proposed for predicting DNA-binding residues in protein sequences using a random forest (RF) classifier with sequence-based features. Two types of novel sequence features are proposed in this study: one reflects the conservation of the physicochemical properties of the amino acids, and the other reflects the correlation of amino acids between different sequence positions in terms of physicochemical properties. The first type of feature combines evolutionary information with the conservation of physicochemical properties of the amino acids, while the second reflects the dependency effect of amino acids with regard to polarity-charge and hydrophobic properties in the protein sequences. These two features, together with an orthogonal binary vector that reflects the characteristics of the 20 types of amino acids, are used to build DNABR, a model to predict DNA-binding residues in proteins. The DNABR model achieves a Matthews correlation coefficient (MCC) of 0.6586 and an overall accuracy (ACC) of 93.04 percent, with 68.47 percent sensitivity (SE) and 98.16 percent specificity (SP). Comparisons among the individual features demonstrate that the two novel features contribute most to the improvement in predictive ability. Furthermore, performance comparisons with other approaches clearly show that DNABR has excellent prediction performance for detecting binding residues in putative DNA-binding proteins. The DNABR web-server system is freely available at http://www.cbi.seu.edu.cn/DNABR/.
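For readers unfamiliar with the four reported measures, the following is a minimal sketch of how SE, SP, ACC, and MCC are conventionally computed from a binary confusion matrix, with binding residues as the positive class. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the DNABR study and do not reproduce its results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: binding residues predicted as binding / non-binding.
    tn/fp: non-binding residues predicted as non-binding / binding.
    """
    # Sensitivity: fraction of true binding residues that are recovered.
    se = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-binding residues correctly rejected.
    sp = tn / (tn + fp) if (tn + fp) else 0.0
    # Overall accuracy across both classes.
    acc = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SE": se, "SP": sp, "ACC": acc, "MCC": mcc}


# Hypothetical counts for an imbalanced data set (few binding residues).
metrics = binary_metrics(tp=70, fp=20, tn=880, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because binding residues are typically a small minority of all residues, ACC can be high even when SE is modest, which is why MCC is usually reported alongside it.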

UI | MeSH Term | Description | Entries
D008958 | Models, Molecular | Models used experimentally or theoretically to study molecular shape, electronic properties, or interactions; includes analogous molecules, computer-generated graphics, and mechanical structures. | Molecular Models; Model, Molecular; Molecular Model
D011485 | Protein Binding | The process in which substances, either endogenous or exogenous, bind to proteins, peptides, enzymes, protein precursors, or allied compounds. Specific protein-binding measures are often used as assays in diagnostic assessments. | Plasma Protein Binding Capacity; Binding, Protein
D004247 | DNA | A deoxyribonucleotide polymer that is the primary genetic material of all cells. Eukaryotic and prokaryotic organisms normally contain DNA in a double-stranded state, yet several important biological processes transiently involve single-stranded regions. DNA, which consists of a polysugar-phosphate backbone possessing projections of purines (adenine and guanine) and pyrimidines (thymine and cytosine), forms a double helix that is held together by hydrogen bonds between these purines and pyrimidines (adenine to thymine and guanine to cytosine). | DNA, Double-Stranded; Deoxyribonucleic Acid; ds-DNA; DNA, Double Stranded; Double-Stranded DNA; ds DNA
D004268 | DNA-Binding Proteins | Proteins which bind to DNA. The family includes proteins which bind to both double- and single-stranded DNA and also includes specific DNA binding proteins in serum which can be used as markers for malignant diseases. | DNA Helix Destabilizing Proteins; DNA-Binding Protein; Single-Stranded DNA Binding Proteins; DNA Binding Protein; DNA Single-Stranded Binding Protein; SS DNA BP; Single-Stranded DNA-Binding Protein; Binding Protein, DNA; DNA Binding Proteins; DNA Single Stranded Binding Protein; DNA-Binding Protein, Single-Stranded; Protein, DNA-Binding; Single Stranded DNA Binding Protein; Single Stranded DNA Binding Proteins
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000596 | Amino Acids | Organic compounds that generally contain an amino (-NH2) and a carboxyl (-COOH) group. Twenty alpha-amino acids are the subunits which are polymerized to form proteins. | Amino Acid; Acid, Amino; Acids, Amino
D012372 | ROC Curve | A graphic means for assessing the ability of a screening test to discriminate between healthy and diseased persons; may also be used in other studies, e.g., distinguishing stimuli responses as to a faint stimuli or nonstimuli. | ROC Analysis; Receiver Operating Characteristic; Analysis, ROC; Analyses, ROC; Characteristic, Receiver Operating; Characteristics, Receiver Operating; Curve, ROC; Curves, ROC; ROC Analyses; ROC Curves; Receiver Operating Characteristics
D015233 | Models, Statistical | Statistical formulations or analyses which, when applied to data and found to fit the data, are then used to verify the assumptions and parameters used in the analysis. Examples of statistical models are the linear model, binomial model, polynomial model, two-parameter model, etc. | Probabilistic Models; Statistical Models; Two-Parameter Models; Model, Statistical; Models, Binomial; Models, Polynomial; Statistical Model; Binomial Model; Binomial Models; Model, Binomial; Model, Polynomial; Model, Probabilistic; Model, Two-Parameter; Models, Probabilistic; Models, Two-Parameter; Polynomial Model; Polynomial Models; Probabilistic Model; Two Parameter Models; Two-Parameter Model
D060388 | Support Vector Machine | SUPERVISED MACHINE LEARNING algorithm which learns to assign labels to objects from a set of training examples. Examples are learning to recognize fraudulent credit card activity by examining hundreds or thousands of fraudulent and non-fraudulent credit card activity, or learning to make disease diagnosis or prognosis based on automatic classification of microarray gene expression profiles drawn from hundreds or thousands of samples. | Support Vector Network; Machine, Support Vector; Machines, Support Vector; Network, Support Vector; Networks, Support Vector; Support Vector Machines; Support Vector Networks; Vector Machine, Support; Vector Machines, Support; Vector Network, Support; Vector Networks, Support
D019295 | Computational Biology | A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories for solving biological problems including manipulation of models and datasets. | Bioinformatics; Molecular Biology, Computational; Bio-Informatics; Biology, Computational; Computational Molecular Biology; Bio Informatics; Bio-Informatic; Bioinformatic; Biologies, Computational Molecular; Biology, Computational Molecular; Computational Molecular Biologies; Molecular Biologies, Computational
