Evaluating Scalable Uncertainty Estimation Methods for Deep Learning-Based Molecular Property Prediction (2020)

Gabriele Scalia, Colin A. Grambow, Barbara Pernici, Yi-Pei Li, and William H. Green
Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States.

Advances in deep neural network (DNN)-based molecular property prediction have recently led to the development of models of remarkable accuracy and generalization ability, with graph convolutional neural networks (GCNNs) reporting state-of-the-art performance for this task. However, some challenges remain, and one of the most important, which has yet to be fully addressed, concerns uncertainty quantification. DNN performance is affected by the volume and the quality of the training samples. Therefore, establishing when and to what extent a prediction can be considered reliable is just as important as outputting accurate predictions, especially when out-of-domain molecules are targeted. Several methods to account for uncertainty in DNNs have recently been proposed, most of which are based on approximate Bayesian inference; among these, only a few scale to the large data sets required in applications. Evaluating and comparing these methods has recently attracted great interest, but results are generally fragmented and, for molecular property prediction, absent. In this paper, we quantitatively compare scalable techniques for uncertainty estimation in GCNNs. We introduce a set of quantitative criteria that capture different aspects of uncertainty and use these criteria to compare MC-dropout, Deep Ensembles, and bootstrapping, both theoretically, in a unified framework that separates aleatoric and epistemic uncertainty, and experimentally, on public data sets. Our experiments quantify the performance of the different uncertainty estimation methods and their impact on uncertainty-related error reduction. Our findings indicate that Deep Ensembles and bootstrapping consistently outperform MC-dropout, each with context-specific pros and cons. Our analysis leads to a better understanding of the roles of aleatoric and epistemic uncertainty, also in relation to the features of the target data set, and highlights the challenge posed by out-of-domain uncertainty.
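To make the aleatoric/epistemic split mentioned in the abstract concrete, the sketch below shows the standard law-of-total-variance decomposition commonly used with Deep Ensembles, bootstrapping, and MC-dropout. It is a minimal illustration, not the authors' implementation: it assumes each model (or stochastic forward pass) outputs a predictive mean and variance, and the function name and toy numbers are hypothetical.

import numpy as np

def decompose_uncertainty(means, variances):
    """Split total predictive variance into aleatoric and epistemic parts.

    means, variances: arrays of shape (n_models, n_molecules), one row per
    ensemble member, bootstrap resample, or MC-dropout forward pass.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    prediction = means.mean(axis=0)        # ensemble-averaged prediction
    epistemic = means.var(axis=0)          # disagreement between models
    aleatoric = variances.mean(axis=0)     # average predicted data noise
    return prediction, aleatoric, epistemic, aleatoric + epistemic

# Toy usage with 3 hypothetical models and 2 molecules:
means = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1]]
variances = [[0.05, 0.10], [0.04, 0.12], [0.06, 0.09]]
print(decompose_uncertainty(means, variances))

The same decomposition applies whether the rows come from independently trained ensemble members, models fit to bootstrap resamples, or repeated forward passes with dropout left active at test time.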

MeSH Terms

UI: D000077321
Term: Deep Learning
Description: Supervised or unsupervised machine learning methods that use multiple layers of data representations generated by nonlinear transformations, instead of individual task-specific ALGORITHMS, to build and train neural network models.
Entries: Hierarchical Learning; Learning, Deep; Learning, Hierarchical

UI: D001499
Term: Bayes Theorem
Description: A theorem in probability theory named for Thomas Bayes (1702-1761). In epidemiology, it is used to obtain the probability of disease in a group of people with some characteristic on the basis of the overall rate of that disease and of the likelihood of that characteristic in healthy and diseased individuals. The most familiar application is in clinical decision analysis, where it is used for estimating the probability of a particular diagnosis given the appearance of some symptoms or test result (a worked numerical example follows this list).
Entries: Bayesian Analysis; Bayesian Estimation; Bayesian Forecast; Bayesian Method; Bayesian Prediction; Analysis, Bayesian; Bayesian Approach; Approach, Bayesian; Approachs, Bayesian; Bayesian Approachs; Estimation, Bayesian; Forecast, Bayesian; Method, Bayesian; Prediction, Bayesian; Theorem, Bayes

UI: D016571
Term: Neural Networks, Computer
Description: A computer architecture, implementable in either hardware or software, modeled after biological neural networks. Like the biological system in which the processing capability is a result of the interconnection strengths between arrays of nonlinear processing nodes, computerized neural networks, often called perceptrons or multilayer connectionist models, consist of neuron-like units. A homogeneous group of units makes up a layer. These networks are good at pattern recognition. They are adaptive, performing tasks by example, and thus are better for decision-making than are linear learning machines or cluster analysis. They do not require explicit programming.
Entries: Computational Neural Networks; Connectionist Models; Models, Neural Network; Neural Network Models; Neural Networks (Computer); Perceptrons; Computational Neural Network; Computer Neural Network; Computer Neural Networks; Connectionist Model; Model, Connectionist; Model, Neural Network; Models, Connectionist; Network Model, Neural; Network Models, Neural; Network, Computational Neural; Network, Computer Neural; Network, Neural (Computer); Networks, Computational Neural; Networks, Computer Neural; Networks, Neural (Computer); Neural Network (Computer); Neural Network Model; Neural Network, Computational; Neural Network, Computer; Neural Networks, Computational; Perceptron

UI: D035501
Term: Uncertainty
Description: The condition in which reasonable knowledge regarding risks, benefits, or the future is not available.
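As a worked numerical example of the diagnostic use described under D001499 (Bayes Theorem) above, the snippet below computes the probability of disease given a positive test from the disease prevalence and the test's behavior in diseased and healthy individuals. The numbers are hypothetical and chosen only for illustration.

# Hypothetical numbers, for illustration of Bayes' theorem only.
prevalence = 0.01            # P(disease): overall rate of the disease
sensitivity = 0.95           # P(test positive | disease)
false_positive_rate = 0.05   # P(test positive | healthy)

p_test_pos = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
p_disease_given_pos = sensitivity * prevalence / p_test_pos
print(round(p_disease_given_pos, 3))  # 0.161: a 1% prior rises to about 16%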
