BucketListPapers 91/100 - How To Assess Uncertainty In A Given Data Set.


As the abstract says, “The maximum achievable accuracy of in silico models depends on the quality of the experimental data.” The paper beautifully describes methods for examining a given dataset and so “defines a natural upper limit to the predictive performance possible.” This #BucketListPapers entry uses publicly available data from ChEMBL, and Figure 5 illustrates the spread in the data for pairs of measurements on a given molecule: the 2.5 log lines really show how much measurements can vary. An appreciation of this degree of variation is important when building ML models for drug discovery.
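To make the idea of comparing pairs of measurements concrete, here is a minimal Python sketch. It is not the paper's code: the function names and the toy pKi values are invented for illustration, and it assumes every measurement carries independent error with the same standard deviation. Under that assumption the difference between two measurements of the same molecule has variance 2σ², so σ can be estimated from the mean squared pairwise difference.

```python
import itertools
import math
from collections import defaultdict


def pairwise_differences(measurements):
    """Return the differences between all pairs of values for each molecule.

    `measurements` is an iterable of (molecule_id, value) tuples, where the
    value is on a log scale (e.g. pKi). Molecules measured once give no pairs.
    """
    by_molecule = defaultdict(list)
    for mol_id, value in measurements:
        by_molecule[mol_id].append(value)

    diffs = []
    for values in by_molecule.values():
        for a, b in itertools.combinations(values, 2):
            diffs.append(a - b)
    return diffs


def estimate_sigma(diffs):
    """Estimate the per-measurement standard deviation from pair differences.

    Assumes independent errors of equal variance, so Var(a - b) = 2 * sigma^2.
    """
    if not diffs:
        raise ValueError("need at least one pair of repeated measurements")
    mean_sq = sum(d * d for d in diffs) / len(diffs)
    return math.sqrt(mean_sq / 2.0)


# Hypothetical pKi values, invented purely for illustration.
toy_data = [
    ("mol_A", 7.0), ("mol_A", 7.4),
    ("mol_B", 5.0), ("mol_B", 5.6), ("mol_B", 5.2),
    ("mol_C", 8.1),  # measured once, so it contributes no pairs
]

diffs = pairwise_differences(toy_data)
sigma = estimate_sigma(diffs)
print(f"{len(diffs)} pairs, estimated sigma = {sigma:.2f} log units")
```

On real data the independence assumption deserves scrutiny, since values in public databases are sometimes copied from one publication to another, which would make the spread look smaller than it really is.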

Kramer: The Experimental Uncertainty of Heterogeneous Public Ki Data.

J. Med. Chem. 2012, 55, 5165-5173

A great follow-up, by the same author, looks at the impact of uncertainty on Matched Molecular Pair Analysis.

Matched Molecular Pair Analysis: Significance and the Impact of Experimental Uncertainty.

J. Med. Chem. 2014, 57, 9, 3786–3802.

#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
