Machine Learning Strategies for Reaction Development: Toward the Low-Data Limit

Navigating chemical space

This useful perspective article discusses many recent attempts to apply machine learning strategies, particularly transfer learning and active learning, to reaction improvement and discovery. The authors survey many examples in which researchers have applied these methods, and they give a balanced view of the methods' potential value alongside a sensible consideration of the challenges. Their concluding section provides one of the better descriptions of this application that I have seen. In particular, they compare the machine learning approach with the approach taken by human chemists, which relies on intuition and general chemical concepts.

An under-appreciated problem is that generating the type and volume of data required for good machine learning can come at very considerable cost. Nonetheless, it is clear that incremental progress is being made towards understanding when transfer learning is likely to be effective and when it may instead cause negative transfer. The authors mention the justified criticism that standard human optimisation often settles on local optima, whereas machines have a better chance of finding the overall best conditions. It is great to see an article acknowledge that many of the datasets generated so far in the history of chemistry were not designed with machine learning in mind. It is highly likely that reactions performed with this consideration in mind will not only prove more valuable for computer-based understanding but will also improve what we humans can learn from them.