Escape from flatland 2: complexity and promiscuity.
Lovering. Med. Chem. Commun., 2013, 4, 515-519
There are certain papers in this set that we believe everyone should read, but for reasons that are not entirely positive. The follow-up to “Escape from flatland” is one of those: “Escape from flatland 2: complexity and promiscuity”.
This is the first of a group of three papers in the bucket list that focus on bad behaviour by those analysing data – one paper illustrating some of the problems and two providing some guidance. It is very easy to fall into lots of traps when analysing large datasets and the reader will doubtless be able to trawl our own papers to find examples of bad practice! The key thing is that amongst the bucket list papers are several that should help you avoid many of the traps.
In flatland, complexity is defined as the fraction of sp3 carbons and the number of chiral centres, a rather limited view of a concept that I am sure we could argue about for years. As for promiscuity, in flatland this turns out to be the number of assays in which a compound achieves greater than 50 % inhibition at a concentration of 10 µM in a 15-assay panel (a parallel analysis looks at a panel of CYP enzymes). This set of 15 assays is described as a subset of the Cerep panel chosen by a group of “internal scientists”. Readers should already be concerned that a compound achieving 49 % inhibition in all 15 assays would score a promiscuity of 0, while one with 51 % inhibition in a single assay would be infinitely more promiscuous (more on this in future bucket list papers). A further worry is that plenty of compounds in the set will not be soluble at 10 µM.
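The thresholding concern can be made concrete with a minimal sketch. It assumes promiscuity is simply a count of assays with inhibition above 50 %, as described above; the function name and the two compounds are invented for illustration and are not taken from the paper.

```python
THRESHOLD = 50.0  # percent inhibition cut-off at 10 µM, as described in the text


def promiscuity(inhibitions, threshold=THRESHOLD):
    """Count the assays in which percent inhibition exceeds the threshold."""
    return sum(1 for value in inhibitions if value > threshold)


# Hypothetical compounds, invented purely for illustration.
near_miss = [49.0] * 15            # 49 % inhibition in every one of 15 assays
single_hit = [51.0] + [0.0] * 14   # 51 % in one assay, nothing in the rest

print(promiscuity(near_miss))   # 0
print(promiscuity(single_hit))  # 1
```

The compound that nearly hits everything scores 0 and the one that barely hits a single assay scores 1, which is the distortion that a hard cut-off on continuous data introduces.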
As shown in the graph, this measure of promiscuity can be plotted against binned values of the fraction of sp3 carbon atoms. Both axes suffer from seen and unseen lines drawn in the sand in order to categorise continuous data. The dataset is not made available, so readers cannot judge how strong the illustrated trends actually are; it is perfectly possible that such trends would change completely if the dividing lines between categories were moved. In the graph, red objects are “aminergic” compounds (containing amines) while blue objects are all others. The author is trying to suggest that increasing “complexity” leads in general to a decrease in “promiscuity”. It is hard to know how much weight to give this suggestion, or how to translate it readily into solving problems in drug discovery projects. Even given the trend, the average promiscuity at the top of the peak is only about 0.3, suggesting that roughly 4 to 5 of the 15 assays are hit (0.3 × 15 = 4.5). It is hard to know how promiscuous this really is without even knowing what the 15 assays are.
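To illustrate how a binned trend can depend on where the dividing line is drawn, here is a small sketch using toy data. The six (Fsp3, promiscuity count) pairs are invented for this example and say nothing about the paper's actual dataset, which is not available; the helper `binned_means` is likewise hypothetical.

```python
def binned_means(points, cut):
    """Split points at `cut` on the x axis and return the mean y of each bin."""
    low = [y for x, y in points if x < cut]
    high = [y for x, y in points if x >= cut]
    return sum(low) / len(low), sum(high) / len(high)


# Toy (Fsp3, number of assays hit) pairs, invented purely for illustration.
toy = [(0.1, 2), (0.2, 2), (0.4, 8), (0.6, 1), (0.7, 1), (0.9, 6)]

# Cut at Fsp3 = 0.5: the low bin looks more promiscuous than the high bin.
print(binned_means(toy, 0.5))

# Cut at Fsp3 = 0.3: the same data now appear to show the opposite trend.
print(binned_means(toy, 0.3))
```

With the cut at 0.5 the mean falls from the low bin to the high bin, while with the cut at 0.3 it rises. Nothing about the data has changed, only the line in the sand, which is why access to the underlying continuous values matters.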