Five years after Roughley and Jordan’s seminal survey, which catalogued medicinal chemists’ preferred reactions by hand, the folks at NextMove and Novartis used automated natural language processing to analyse >200,000 patents and extract over 1.1 million unique reactions. They then classified the reactions using the Roughley and Jordan reaction types. With this much larger data set they could analyse how reaction types have evolved. In carbon-carbon bond formation, for example, they see a switch from phosphorus ylide reactions to palladium-catalysed cross-couplings as the Suzuki and Negishi reactions were adopted in drug-hunting research. Even so, alkylation and acylation of heteroatoms remain key processes. They also analysed the properties of the reaction products and found, unsurprisingly, that compounds grew in size and rigidity over the 40-year period reviewed.
Work on this scale would never have been possible without automation, and now it’s hard to see why anyone would ever go back.
“Big Data from Pharmaceutical Patents: A Computational Analysis of Medicinal Chemists’ Bread and Butter” by Schneider, Lowe, Sayle, Tarselli & Landrum
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe