Novel Algorithms for the Identification of Biologically Informative Chemical Diversity Metrics. 2008

Bhargav Theertham, Jenna L Wang, Jianwen Fang, and Gerald H Lushington
Center of Excellence for Chemical Methodology and Library Development, The University of Kansas, Lawrence, KS 66045, USA.

Despite great advances in the efficiency of analytical and synthetic chemistry, time and available starting material still limit the number of unique compounds that can be practically synthesized and evaluated as prospective therapeutics. Chemical diversity analysis (the capacity to identify finite diverse subsets that reliably represent larger manifolds of drug-like chemicals) thus remains an important resource in drug discovery. Despite an unproven track record, chemical diversity has also been used to posit, from preliminary screen hits, new compounds with similar or better activity. Identifying diversity metrics that demonstrably encode bioactivity trends is thus of substantial potential value for the intelligent assembly of targeted screens. This paper reports novel algorithms designed to simultaneously reflect chemical similarity or diversity trends and apparent bioactivity in compound collections. An extensive set of descriptors is evaluated within large NCI screening data sets according to bioactivity differentiation capacity, quantified as the ability to colocalize known active species into bioactive-rich K-means clusters. One method tested for descriptor selection orders features according to relative variance across a set of training compounds, and samples increasingly fine subset meshes for descriptors whose exclusion from the model induces drastic drops in relative bioactive colocalization. This yields metrics with reasonable bioactive enrichment (greater than 50% of all bioactive compounds collected into clusters or cells with significantly enriched active/inactive rates) for each of the four data sets examined herein. A second method replaces variance with an active/inactive divergence score, achieving comparable enrichment via a much more efficient search process. Combinations of the above metrics are tested in 2D rectilinear diversity models, achieving similarly successful colocalization statistics, with metrics derived from the active/inactive divergence score typically outperforming those selected from the variance criterion and computed from the DiverseSolutions software.
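
To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract gives no formulas, so several pieces are assumptions introduced purely for illustration: the function names, the rule that a cluster counts as "enriched" when its active rate is at least 1.5 times the overall active rate, plain population variance as a stand-in for "relative variance", and a divergence score defined as the absolute difference between active and inactive means divided by the overall standard deviation. Cluster labels are taken as given (for example, from any K-means run).

```python
from collections import Counter
from math import sqrt
from statistics import mean, pvariance


def colocalization(labels, active, enrichment=1.5):
    """Fraction of actives that sit in enriched clusters.

    A cluster is treated as enriched when its active rate is at least
    `enrichment` times the overall active rate (an assumed threshold).
    """
    n_active = sum(active)
    if n_active == 0:
        return 0.0
    base_rate = n_active / len(active)
    size = Counter(labels)
    hits = Counter(lab for lab, a in zip(labels, active) if a)
    enriched = [c for c in size if hits[c] / size[c] >= enrichment * base_rate]
    return sum(hits[c] for c in enriched) / n_active


def variance_rank(X):
    """Descriptor indices ordered by decreasing variance across compounds."""
    cols = list(zip(*X))
    return sorted(range(len(cols)), key=lambda j: pvariance(cols[j]), reverse=True)


def divergence_rank(X, active):
    """Descriptor indices ordered by an assumed active/inactive divergence score.

    Requires at least one active and one inactive compound.
    """
    def score(col):
        act = [v for v, a in zip(col, active) if a]
        ina = [v for v, a in zip(col, active) if not a]
        sd = sqrt(pvariance(col))
        return abs(mean(act) - mean(ina)) / sd if sd > 0 else 0.0

    cols = list(zip(*X))
    return sorted(range(len(cols)), key=lambda j: score(cols[j]), reverse=True)


# Toy data (invented): 8 compounds, 2 descriptors, 3 actives.
X = [[1.0, 0.0], [1.1, 10.0], [0.9, 0.0], [0.0, 10.0],
     [0.1, 0.0], [0.0, 10.0], [0.1, 0.0], [1.0, 10.0]]
active = [1, 1, 0, 0, 0, 0, 0, 1]
labels = [0, 0, 0, 1, 1, 1, 1, 1]  # cluster assignments, assumed given

coloc = colocalization(labels, active)
by_variance = variance_rank(X)
by_divergence = divergence_rank(X, active)
print(coloc, by_variance, by_divergence)
```

On this toy data the variance ordering places the high-variance but uninformative second descriptor first, whereas the assumed divergence score places the descriptor that separates actives from inactives first. This only illustrates why a class-aware score might need to examine fewer descriptors; it says nothing about the results reported for the NCI data sets.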
