Can We Quickly Learn to "Translate" Bioactive Molecules with Transformer Models? 2023

Emma P Tysinger, Brajesh K Rai, and Anton V Sinitskiy
Machine Learning and Computational Sciences, Pfizer Worldwide Research, Development, and Medical, 610 Main Street, Cambridge, Massachusetts 02139, United States.

Meaningful exploration of the chemical space of druglike molecules in drug design is a highly challenging task due to a combinatorial explosion of possible modifications of molecules. In this work, we address this problem with transformer models, a type of machine learning (ML) model originally developed for machine translation. By training transformer models on pairs of similar bioactive molecules from the public ChEMBL data set, we enable them to learn medicinal-chemistry-meaningful, context-dependent transformations of molecules, including those absent from the training set. By retrospective analysis of the performance of transformer models on ChEMBL subsets of ligands binding to COX2, DRD2, or HERG protein targets, we demonstrate that the models can generate structures identical or highly similar to most active ligands, even though the models had not seen any ligands active against the corresponding protein target during training. Our work demonstrates that human experts working on hit expansion in drug design can easily and quickly employ transformer models, originally developed to translate texts from one natural language to another, to "translate" from known molecules active against a given protein target to novel molecules active against the same target.
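
The idea of training on "pairs of similar bioactive molecules" can be illustrated with a minimal sketch. The abstract does not say how similarity was measured or which threshold was used, so everything below is an assumption made for illustration: the character-bigram features are a crude stand-in for a real molecular fingerprint, and the function names `ngrams`, `tanimoto`, and `similar_pairs`, as well as the threshold of 0.3, are hypothetical. Each ordered pair plays the role of a (source, target) "sentence pair" for a translation-style model.

```python
from itertools import permutations


def ngrams(smiles: str, n: int = 2) -> set[str]:
    """Character n-grams of a SMILES string.

    ASSUMPTION: a crude stand-in for a chemical fingerprint; a real
    pipeline would use a cheminformatics toolkit instead.
    """
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a: set[str], b: set[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(
    smiles_list: list[str], threshold: float = 0.3
) -> list[tuple[str, str]]:
    """Ordered (source, target) pairs whose similarity meets the threshold.

    ASSUMPTION: the threshold value is illustrative, not from the paper.
    """
    feats = {s: ngrams(s) for s in smiles_list}
    return [
        (src, tgt)
        for src, tgt in permutations(smiles_list, 2)
        if tanimoto(feats[src], feats[tgt]) >= threshold
    ]


# Toy input: ethanol, ethylamine, benzene (as SMILES strings).
molecules = ["CCO", "CCN", "c1ccccc1"]
pairs = similar_pairs(molecules)
```

With this toy input, ethanol and ethylamine share one of three distinct bigrams and so form a pair in both directions, while benzene shares none and is left unpaired.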

UI | MeSH Term | Description | Entries
D006801 | Humans | Members of the species Homo sapiens. | Homo sapiens; Man (Taxonomy); Human; Man, Modern; Modern Man
D000069550 | Machine Learning | A type of ARTIFICIAL INTELLIGENCE that enable COMPUTERS to independently initiate and execute LEARNING when exposed to new data. | Transfer Learning; Learning, Machine; Learning, Transfer
D012189 | Retrospective Studies | Studies used to test etiologic hypotheses in which inferences about an exposure to putative causal factors are derived from data relating to characteristics of persons under study or to events or experiences in their past. The essential feature is that some of the persons under study have the disease or outcome of interest and their characteristics are compared with those of unaffected persons. | Retrospective Study; Studies, Retrospective; Study, Retrospective
D015195 | Drug Design | The molecular designing of drugs for specific purposes (such as DNA-binding, enzyme inhibition, anti-cancer efficacy, etc.) based on knowledge of molecular properties such as activity of functional groups, molecular geometry, and electronic structure, and also on information cataloged on analogous molecules. Drug design is generally computer-assisted molecular modeling and does not include PHARMACOKINETICS, dosage analysis, or drug administration analysis. | Computer-Aided Drug Design; Computerized Drug Design; Drug Modeling; Pharmaceutical Design; Computer Aided Drug Design; Computer-Aided Drug Designs; Computerized Drug Designs; Design, Pharmaceutical; Drug Design, Computer-Aided; Drug Design, Computerized; Drug Designs; Drug Modelings; Pharmaceutical Designs
