Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 2: a discussion of chemical and biological data

Drug Discovery Today. 2021, 26, 4, 1040-1052.

Introduced in last month’s newsletter, part 1 of this review by Bender and Cortes-Ciriano summarised the challenges associated with utilising AI effectively in drug discovery. Part 2 offers a deeper analysis of the chemical and biological data domains and highlights the difficulties of building useful AI models for drug efficacy and safety on such complex data. The authors provide a thought-provoking comparison of these domains with those that are well suited to AI, such as image and speech recognition, where AI models have found particular success. While image and speech data can be represented directly as pixels and waveforms, it is difficult to choose an appropriate representation of chemical and biological data, because the most suitable descriptors depend on the endpoint of interest. Furthermore, the vast and unknown distribution of chemical space makes it impossible to choose unbiased data points for model building. Additionally, labelling chemical and biological data presents a significant challenge due to the complexity of disease and the conditionality of biological measurements. For example, do we assign disease labels based on the underlying cause, the mechanism or the symptoms shown in the individual?

To conclude, the takeaway message from this review is a call for greater scrutiny of the ways in which the drug discovery community generates and records data. Instead of the current “technology push”, where the data collected depend on the available technology, we need to work towards “science pull”, where the scientific question is defined first and experiments are then designed to collect relevant data for model building.
