In silico re-identification of properties of drug target proteins. 2017

Baeksoo Kim, Jihoon Jo, Jonghyun Han, Chungoo Park, and Hyunju Lee
Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju, 61005, Republic of Korea.

BACKGROUND Computational approaches to drug target identification are expected to reduce the time and effort required for drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of the druggable genome. Although several studies have sought to distinguish drug targets from non-drug targets, they have focused mainly on the sequences and functional roles of proteins; many other protein properties have not been fully investigated.
METHODS Using the DrugBank database (version 3.0), which contains 6,816 drug entries, including 760 FDA-approved drugs and 1,822 of their targets, together with the human UniProt/Swiss-Prot database, we defined 1,578 non-redundant drug target proteins and 17,575 non-drug target proteins. To select these non-redundant protein sets, we built four datasets (A, B, C, and D) by clustering paralogous proteins.
RESULTS We first reassessed the widely used properties of drug target proteins. We confirmed and extended earlier findings that drug target proteins (1) tend to be more hydrophobic and less polar, with fewer PEST sequences and more signal peptide sequences, and (2) are more often involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. We then proposed new properties (essentiality, expression pattern, post-translational modifications (PTMs), and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins have high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we used two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties with the newly proposed ones, an F-score of 0.8307 was obtained.
CONCLUSIONS Integrating the newly proposed properties improves prediction performance, and these properties are related to drug targets. We believe that our study offers a new perspective on inferring drug-target interactions.
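
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F-score is computed from classification counts. It assumes the reported F-score is the balanced F1 (the abstract does not say which variant was used). The function name `f_score` and the example counts are hypothetical and are not taken from the study.

```python
def f_score(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from confusion counts; beta=1 gives the balanced F1."""
    if tp == 0:
        # No true positives: precision and recall are zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)  # fraction of predicted drug targets that are real
    recall = tp / (tp + fn)     # fraction of real drug targets that were found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only (not data from the paper):
# 80 true positives, 20 false positives, 10 false negatives.
print(round(f_score(80, 20, 10), 4))  # prints 0.8421
```

Because the score is the harmonic mean of precision and recall, it stays low unless both are high, which is useful when non-drug targets greatly outnumber drug targets, as in the datasets described above.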

UI | MeSH Term | Description | Entries
D011499 | Protein Processing, Post-Translational | Any of various enzymatically catalyzed post-translational modifications of PEPTIDES or PROTEINS in the cell of origin. These modifications include carboxylation; HYDROXYLATION; ACETYLATION; PHOSPHORYLATION; METHYLATION; GLYCOSYLATION; ubiquitination; oxidation; proteolysis; and crosslinking and result in changes in molecular weight and electrophoretic motility. | Amino Acid Modification, Post-Translational,Post-Translational Modification,Post-Translational Protein Modification,Posttranslational Modification,Protein Modification, Post-Translational,Amino Acid Modification, Posttranslational,Post-Translational Amino Acid Modification,Post-Translational Modifications,Post-Translational Protein Processing,Posttranslational Amino Acid Modification,Posttranslational Modifications,Posttranslational Protein Processing,Protein Processing, Post Translational,Protein Processing, Posttranslational,Amino Acid Modification, Post Translational,Modification, Post-Translational,Modification, Post-Translational Protein,Modification, Posttranslational,Modifications, Post-Translational,Modifications, Post-Translational Protein,Modifications, Posttranslational,Post Translational Amino Acid Modification,Post Translational Modification,Post Translational Modifications,Post Translational Protein Modification,Post Translational Protein Processing,Post-Translational Protein Modifications,Processing, Post-Translational Protein,Processing, Posttranslational Protein,Protein Modification, Post Translational,Protein Modifications, Post-Translational
D011506 | Proteins | Linear POLYPEPTIDES that are synthesized on RIBOSOMES and may be further modified, crosslinked, cleaved, or assembled into complex proteins with several subunits. The specific sequence of AMINO ACIDS determines the shape the polypeptide will take, during PROTEIN FOLDING, and the function of the protein. | Gene Products, Protein,Gene Proteins,Protein,Protein Gene Products,Proteins, Gene
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens,Man (Taxonomy),Human,Man, Modern,Modern Man
D016208 | Databases, Factual | Extensive collections, reputedly complete, of facts and data garnered from material of a specialized subject area and made available for analysis and application. The collection can be automated by various contemporary methods for retrieval. The concept should be differentiated from DATABASES, BIBLIOGRAPHIC which is restricted to collections of bibliographic references. | Databanks, Factual,Data Banks, Factual,Data Bases, Factual,Data Bank, Factual,Data Base, Factual,Databank, Factual,Database, Factual,Factual Data Bank,Factual Data Banks,Factual Data Base,Factual Data Bases,Factual Databank,Factual Databanks,Factual Database,Factual Databases
D060388 | Support Vector Machine | SUPERVISED MACHINE LEARNING algorithm which learns to assign labels to objects from a set of training examples. Examples are learning to recognize fraudulent credit card activity by examining hundreds or thousands of fraudulent and non-fraudulent credit card activity, or learning to make disease diagnosis or prognosis based on automatic classification of microarray gene expression profiles drawn from hundreds or thousands of samples. | Support Vector Network,Machine, Support Vector,Machines, Support Vector,Network, Support Vector,Networks, Support Vector,Support Vector Machines,Support Vector Networks,Vector Machine, Support,Vector Machines, Support,Vector Network, Support,Vector Networks, Support
D023281 | Genomics | The systematic study of the complete DNA sequences (GENOME) of organisms. Included is construction of complete genetic, physical, and transcript maps, and the analysis of this structural genomic information on a global scale such as in GENOME WIDE ASSOCIATION STUDIES. | Functional Genomics,Structural Genomics,Comparative Genomics,Genomics, Comparative,Genomics, Functional,Genomics, Structural
