Practically Significant Method Comparison Protocols for Machine Learning in Small Molecule Drug Discovery

This open-access, hands-on perspective offers best-practice recommendations for rigorous method comparison in ML-based small-molecule property prediction. With the field growing rapidly, the authors meet a current community need by providing, by example, a state-of-the-art resource and “helper” software for rigorous model comparison, with a focus on statistical replicability.

Anyone building or applying models faces the perpetual question of “how are we doing?”. This article equips practitioners and decision-makers with robust experimental analysis techniques, with the aim of preventing future studies from resting on “fragile assumptions” and of reversing the well-evidenced “replicability crisis” in contemporary ML-based research.

While the headline method comparison “guidelines” are extremely useful and succinctly summarised in Figure 1, the authors take great care to dissuade readers from “blindly” following this recipe! Technically oriented readers will appreciate the annotated Python code that accompanies the recommendations, the signposting of useful open-source software in Table 2, and the noteworthy Supporting Information explainers. Importantly, all data sets used were derived from freely available public sources. These practically minded aspects of the article are key to making the recommended protocols accessible and should facilitate their adoption.

Overall, the article succeeds in setting out best practice and strongly encourages the community to perform increasingly rigorous computational experiments and comparisons between them, a trajectory that can only benefit future ML-based small-molecule drug discovery efforts!

Practically Significant Method Comparison Protocols for Machine Learning in Small Molecule Drug Discovery:

J. Chem. Inf. Model. 2025, 65, 18, 9398–9411