The MedChemica Bucket List.
BucketListPapers 100/100: Everything You Have Read May Be Untrue…
We’ve made it to the last paper in our #BucketListPapers series, and for that reason we are now going to tell you that everything you have read is false. Not quite, but there are concerns that a large majority of the findings in modern research are false, so papers should be read with an element of caution and a check that the research is credible.
Ioannidis constructs a statistical model that estimates, under different assumptions, the probability that a claimed finding in a paper is a false positive. It is important to note that “negative” research is sometimes just as important as research that works, even though there seems to be less of a drive to publish it. Such papers were not included in this analysis, as only claims of a relationship were considered.
One of the main contributors is bias, whether deliberate or unconscious. These biases can include the way in which the data and analysis have been handled and reported.
Six corollaries were stated that could impact the reliability of published research. These state that:
- the smaller the studies, the less likely the findings are to be true
- the smaller the relationship, the less likely the findings are to be true
- the greater the number and the lesser the selection of tested relationships in the field, the less likely the findings are to be true
- the greater the flexibility in the whole study and analysis, the less likely the findings are to be true
- the greater the financial, other interests and prejudices in a field, the less likely the findings are to be true
- the hotter a scientific field (with more scientific teams involved), the less likely the findings are to be true
Even though there will never be a gold standard for publishing research findings, Ioannidis outlines that we must all do more to improve the situation. The first suggestion is that we should have larger studies with low bias. The second is that every research team addressing the same research question should be considered equal, with no single team’s findings given more significance than another’s, so that the evidence is weighed as a whole. The final one is that we should improve our understanding of R values (the ratio of true relationships to no relationships among those tested in a field) and, rather than chasing statistical significance, consider before a study what the chances are that a true rather than a non-true relationship is being tested. Additionally, in order to prevent false findings being published, all research results should be reproducible.
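To make the role of R concrete, here is a minimal Python sketch of the positive predictive value (PPV) formula from the paper: the post-study probability that a claimed relationship is true, given the pre-study odds R, the type I error rate α, the type II error rate β and a bias term u. The function name and the example inputs are our own choices for illustration, not values taken from the paper.

```python
def ppv(R, alpha=0.05, power=0.8, bias=0.0):
    """Post-study probability that a claimed relationship is true.

    R     : pre-study odds (true relationships : no relationships)
    alpha : type I error rate
    power : 1 - beta, where beta is the type II error rate
    bias  : u, the fraction of analyses that would not have been
            positive but are reported as positive because of bias
    """
    beta = 1.0 - power
    true_positives = (1.0 - beta) * R + bias * beta * R
    all_positives = R + alpha - beta * R + bias - bias * alpha + bias * beta * R
    return true_positives / all_positives


# Illustrative scenarios (inputs chosen by us, not taken from the paper)
scenarios = [
    ("well powered, even odds", 1.0, 0.8, 0.0),
    ("well powered, long shot", 0.1, 0.8, 0.0),
    ("under powered, long shot", 0.1, 0.2, 0.0),
    ("under powered, long shot, biased", 0.1, 0.2, 0.3),
]
for name, R, power, bias in scenarios:
    print(f"{name}: PPV = {ppv(R, power=power, bias=bias):.2f}")
```

Lowering the power, lowering R or adding bias each pulls the PPV down, which is the arithmetic behind several of the corollaries listed above.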
Why Most Published Research Findings Are False.
John P. A. Ioannidis
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 99/100: The importance of validating statistical methods against real-world data
In this #BucketListPapers, software packages used for the statistical analysis of fMRI data, specifically those designed to correct for familywise error (FWE), are validated against real-world, resting-state fMRI data from healthy controls. FWE correction is important in experiments where many measurements are taken in parallel and combined to infer a statistical signal, because the chance of at least one false positive grows with the number of tests, leading to a higher overall false positive rate. Eklund et al. compare the FWE rates for three standard fMRI analysis software packages (SPM, FSL and AFNI), using a variety of commonly employed methods and parameters, and identify false positive rates of up to 70%. Importantly, the paper concludes by recommending that the fMRI community focuses on validating existing statistical analysis methods and highlights the importance of real-world data sharing in efforts to do this; these are considerations that apply in many areas of the life sciences.
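As a rough illustration of why correction is needed, the sketch below computes the familywise error rate for a set of independent tests, together with the simple Bonferroni threshold. This is a textbook simplification: neighbouring fMRI voxels are not independent, and this is not the cluster-based inference that the paper evaluates. The function names are ours.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / n_tests


for n_tests in (1, 10, 100):
    uncorrected = familywise_error_rate(0.05, n_tests)
    corrected = familywise_error_rate(bonferroni_threshold(0.05, n_tests), n_tests)
    print(f"{n_tests:4d} tests: uncorrected FWE = {uncorrected:.3f}, "
          f"Bonferroni-corrected FWE = {corrected:.3f}")
```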
Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates.
PNAS, 2016, 113(28), 7900-7905
BucketListPapers 98/100: Walking with Purpose
A drug discovery project has various twists and turns in its iterative design and optimisation process, which can cover various areas of chemical space. Understanding how these projects evolve over time could be crucial in the development of new projects. It is also important to understand why projects fail, which parameters could affect a drug discovery project, and what the end goal of the project is. #BucketListPapers
Delaney created a model that attempts to replicate the process through a self-avoiding walk (SAW). Researchers are, overall, a lot better at documenting successes than failures, which makes it difficult to understand the difference between projects that work and those that don’t. The parameters of the modelled project can be adapted and explored. The model trajectories can be visualised on Sammon plots, and these look similar to those of real-life projects. The accompanying figure is an example of a random walk generated by Delaney. The green circle is the starting point and the red circle is the end point, in this instance after 1000 steps. The start of the sequence is green, the mid-section is blue and the end steps of the sequence are in red. Delaney analysed the number of successes and failures of each SAW project, where a success is a project that reaches its specific target and a failure is one where the SAW terminates before reaching the project’s target. This analysis showed that allowing projects to run for longer increases the number of successes per project. Additionally, altering the departmental parameters had little effect on the number of successes per project but did have an impact on the number of projects per step. Self-avoiding walks can, therefore, be used to predict how a drug discovery project would react to certain constraints, and so to understand how successful or not it may be.
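For readers who have not met a self-avoiding walk before, here is a minimal sketch on a two-dimensional square lattice. It is not Delaney's model: the lattice, the "target" (reaching a given Manhattan distance from the start) and the function names are all assumptions made for illustration. It does show the two outcomes described above: a walk either reaches its target or terminates early because it has trapped itself or run out of steps.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walk(max_steps, rng):
    """Walk on a 2D square lattice, never revisiting a site.

    Returns the list of visited sites; the walk stops early if every
    neighbouring site has already been visited (the walk is trapped).
    """
    position = (0, 0)
    path = [position]
    visited = {position}
    for _ in range(max_steps):
        x, y = position
        free = [(x + dx, y + dy) for dx, dy in STEPS
                if (x + dx, y + dy) not in visited]
        if not free:
            break  # trapped: the "project" terminates
        position = rng.choice(free)
        visited.add(position)
        path.append(position)
    return path


def success_rate(n_projects, max_steps, target_distance, seed=0):
    """Fraction of walks that get at least target_distance from the start.

    The distance-based target is a hypothetical stand-in for a project goal.
    """
    rng = random.Random(seed)
    successes = 0
    for _ in range(n_projects):
        path = self_avoiding_walk(max_steps, rng)
        if any(abs(x) + abs(y) >= target_distance for x, y in path):
            successes += 1
    return successes / n_projects


for max_steps in (20, 50, 200):
    print(max_steps, success_rate(500, max_steps, target_distance=15))
```

Varying `max_steps` and `target_distance` is a quick way to get a feel for how run length and goal difficulty interact in this toy version.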
Delaney: Modelling Iterative Compound Optimisation using a Self-Avoiding Walk
Drug Discov Today. 2009 Feb;14(3-4):198-207
BucketListPapers 97/100: Still relevant today…
From the title of this #BucketListPapers you would think it was just out. Given the resurgence of deep learning methods to devise synthetic routes you might expect this, but it was published in 1969, a couple of months after the first Moon landing! The work is famous in organic chemistry, and E. J. Corey won a Nobel prize. Although you may have been taught the methods (and read the books) you may not have read the paper ‘Computer-Assisted Design of Complex Organic Syntheses’.** The paper is long and describes in detail many of the cheminformatics issues that arise when writing a piece of software for use by chemists. All through the paper you will keep saying to yourself ‘this was 1969!’. The process of automating the retrosynthetic analysis of an organic molecule is hard. Like a game of chess, once a few moves have been made the possible combinations multiply.
One important facet of the design of this software was the continued input of the chemist to ‘guide’ the algorithm through the process – this also saves computational effort and memory. The descriptions of a chemical structure drawing package,*** of mapping the structure into a graph, and of perceiving the chemical functionality are still relevant today, and the methods are, for the most part, unchanged. The description of the SSSR (Smallest Set of Smallest Rings) problem is in here too!
We picked out Figure 2, one of many describing the algorithm, because it starts with ‘Chemists enters target molecule’ and ends with ‘Chemist satisfied?’. The essential principles of devising a computer program to design synthetic routes are still current. The approach taken in the program is supervised; the chemical disconnections / forward reactions need to be encoded, and as such it is hard to keep the program up to date and modern.
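To give a flavour of the "structure as a graph" idea, the sketch below stores a molecule as a list of bonds between numbered atoms and computes how many rings an SSSR must contain. It uses the standard graph result that this number equals bonds minus atoms plus connected components. It only counts the rings; identifying which rings to choose is the hard part, and the code makes no attempt to reproduce the algorithm in the paper. The function names and atom numberings are ours.

```python
def count_components(n_atoms, bonds):
    """Number of connected components, via a small union-find."""
    parent = list(range(n_atoms))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for a, b in bonds:
        parent[find(a)] = find(b)
    return len({find(i) for i in range(n_atoms)})


def sssr_size(n_atoms, bonds):
    """Number of rings in a smallest set of smallest rings."""
    return len(bonds) - n_atoms + count_components(n_atoms, bonds)


# Heavy-atom graphs, atoms numbered from 0
benzene = [(i, (i + 1) % 6) for i in range(6)]
naphthalene = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
               (4, 6), (6, 7), (7, 8), (8, 9), (9, 5)]
cubane = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6),
          (6, 7), (7, 4), (0, 4), (1, 5), (2, 6), (3, 7)]

print(sssr_size(6, benzene), sssr_size(10, naphthalene), sssr_size(8, cubane))
```

Cubane is the classic awkward case: a cube has six faces, but the set contains only five rings, so any choice leaves one face out.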
Computer-Assisted Design of Complex Organic Syntheses
Science 10 Oct 1969: Vol. 166, Issue 3902, pp. 178-192
Mention must also be made of the current approaches that use deep learning to extract the chemical reactions from the data, providing the body of knowledge in an unsupervised and updatable manner.
Planning chemical syntheses with deep neural networks and symbolic AI
Nature 2018, 555(7698), 604-610.
** – of additional note: the modern availability of journals online means we miss that trip to the library and the ‘paper after the one we were looking for’ moment of discovery. For this paper (when you download the PDF) just read the start of the following paper… still relevant today…
*** – and they basically look the same today…
BucketListPapers 96/100: Automating Structure-Activity Evaluation
The main principles of the drug discovery process rely on a correlation existing between a chemical structure and its activity. This #BucketListPapers integrates two methods of finding such correlations. Several promising methods have arisen that attempt to find the relationship between structure and biological activity: quantitative structure-activity relationships (QSAR), pattern recognition (PR) methods and discriminant analysis (DA). Klopman introduced a new method that combines the latter two. The method breaks a molecule into fragments of varying size, from 3 to 12 interconnected atoms. The hydrogens are retained, along with information about the multiplicity of the bonds and whether an atom is connected to a terminal functional group. Statistical analysis is then performed on each of the fragments to indicate whether it is an important feature of active or inactive molecules; the activating fragments are known as biophores and the inactivating fragments as biophobes.
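The fragmentation step can be sketched as enumerating every linear chain of connected atoms in the molecular graph. The code below is a simplified illustration rather than Klopman's implementation: it works on heavy atoms only (the paper retains the hydrogens), it writes fragments as plain strings of element symbols and bond symbols, and the function name and input format are our own assumptions.

```python
def linear_fragments(atoms, bonds, min_atoms=3, max_atoms=12):
    """Enumerate unique linear fragments of min_atoms..max_atoms atoms.

    atoms : list of element symbols, indexed by atom number
    bonds : dict mapping (i, j) atom index pairs to bond order (1, 2 or 3)
    """
    neighbours = {i: [] for i in range(len(atoms))}
    order = {}
    for (i, j), bond_order in bonds.items():
        neighbours[i].append(j)
        neighbours[j].append(i)
        order[(i, j)] = order[(j, i)] = bond_order

    symbols = {1: "-", 2: "=", 3: "#"}
    found = set()

    def label(path):
        parts = [atoms[path[0]]]
        for a, b in zip(path, path[1:]):
            parts.append(symbols[order[(a, b)]])
            parts.append(atoms[b])
        return "".join(parts)

    def extend(path):
        if len(path) >= min_atoms:
            # A chain read forwards or backwards is the same fragment
            found.add(min(label(path), label(path[::-1])))
        if len(path) == max_atoms:
            return
        for nxt in neighbours[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return found


# Acetic acid, heavy atoms only: CH3-C(=O)-OH
acetic_acid_atoms = ["C", "C", "O", "O"]
acetic_acid_bonds = {(0, 1): 1, (1, 2): 2, (1, 3): 1}
print(sorted(linear_fragments(acetic_acid_atoms, acetic_acid_bonds)))
```

In the full method, each fragment would then be counted across the active and inactive molecules and tested statistically to decide whether it behaves as a biophore or a biophobe.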
Three studies were carried out in order to understand how useful this method was. Two of the studies already had existing information, and the biophores and biophobes that were extracted were reasonable and aligned with previous studies. Additionally, the models correctly identified 35/43 and 41/43 compounds respectively. The accompanying image demonstrates three fragments, two activating and one deactivating. The two activating fragments show a bay region that had previously, in a different study, been considered important for binding, and which the deactivating fragment does not possess. In the last study, there was no prior information about what caused molecules to be active or inactive. Ten statistically relevant biophores/biophobes were identified. Using these biophores and biophobes, the model was able to correctly predict 33/39 compounds.
Klopman, therefore, showed that this approach had the potential to automatically extract information and turn it into useful biophores and biophobes. This method paved the way for applying artificial intelligence approaches to structure-activity studies.
Klopman: Artificial intelligence approach to structure-activity studies. Computer automated structure evaluation of biological activity of organic molecules
J. Am. Chem. Soc. 1984, 106, 24, 7315–7321
BucketListPapers 95/100: Automated workflow for multiobjective de novo design
De novo drug design involves the enumeration of new molecules with the aim of satisfying multiple activity and pharmacokinetic objectives. The combination of multiparameter optimisation and the vast chemical search space makes enumeration and ranking a significant challenge for any de novo design project. This #BucketListPapers introduces MOARF (MultiObjective Automated Replacement of Fragments), a workflow for the multiobjective design of synthetically accessible small molecules. Specifically, the protocol applies SynDiR, a fragmentation algorithm that uses retrosynthetic rule-based cuts, to an input molecule. Replacement fragments are then identified from a large dataset of predefined fragments, generated by running SynDiR on molecule libraries, based on their physicochemical properties, similarity to the original fragment and alignment using Rapid Alignment Searching (RATS). The reconstructed molecules are then filtered and scored using a combined scoring function assembled from a range of user-defined functions, such as activity prediction and shape similarity models.
Importantly, the MOARF workflow was tested on a historical project from the Institute of Cancer Research (ICR), in which a CDK2 inhibitor was optimised for potency and metabolic stability. The results showed that MOARF generated a diverse dataset of suggestions within relevant areas of chemical space, and optimising against shape similarity (using ROCS), atom pair similarity, CDK2 activity prediction and CLogP resulted in the synthesis of several molecules with CDK2 activity and improved metabolic stability. This demonstrates the value of using computational enumeration and scoring methods to guide de novo design and multiparameter optimisation.
MOARF, an Integrated Workflow for Multiobjective Optimization: Implementation, Synthesis, and Biological Evaluation
BucketListPapers 94/100: It is more than physical properties alone…
We learn at degree level that enantiomeric compounds have exactly the same properties, except for measurements such as the rotation of polarised light. However, when enantiomers are measured in the common assays used in drug hunting, what effects do we see, and by how much do they vary? In this #BucketListPapers, Leach et al. looked at enantiomeric matched pairs and found that for the likes of aqueous solubility there was no difference, as expected, but for some measured properties, like cytochrome P450 inhibition, there were significant differences. Perhaps most surprising were the absorption measurements, which showed no difference for either A-B or efflux ratio.
Enantiomeric pairs reveal that key medicinal chemistry parameters vary more than simple physical property based models can explain.
Med. Chem. Commun., 2012, 3, 528
BucketListPapers 93/100: More Matched Pairs Papers!
A critical part of Matched Molecular Pair Analysis is the inclusion of the chemical environment information with the chemical change. Scientists at GSK were the first to publish this finding and quantify its importance. Hence this is our second selection for a #BucketListPapers on MMPA.
Lead Optimization Using Matched Molecular Pairs: Inclusion of Contextual Information for Enhanced Prediction of hERG Inhibition, Solubility, and Lipophilicity
J. Chem. Inf. Model. 2010, 50, 1872–1886
More honourable mentions:
In 2006, Abbott Laboratories published Drug-Guru, a medicinal chemistry “expert system” that contained a collection of molecular transformations compiled from the literature and medicinal chemists’ experience. When this was published, we had started work on a similar idea but using MMPA to ‘find’ all of the significant molecular transformations for med chem, thus saving the bother of manual cataloguing, and thus operating in an un-biased way.
A computer software program for drug design using medicinal chemistry rules.
Bioorg. Med. Chem. 2006, 14, 7011-7022.
For the statistical analysis of MMPA and the establishment of ‘med chem rules’ see:
Turbocharging Matched Molecular Pair Analysis: Optimizing the Identification and Analysis of Pairs.
J. Chem. Inf. Model. 2017, 57, 2424−2436
Coming back to the good people at GSK and Sheffield Uni, recently they examined the output of drug design algorithms compared with human output… an excellent read.
A Turing test for molecular generators
J. Med. Chem. 2020, 63, 20, 11964–11971
BucketListPapers 92/100: Here Come the Matched Pair Papers!
Presenters at medicinal chemistry conferences now use the phrases ‘matched pairs’ and ‘matched pair analysis’, showing the technique has become a ‘must-have’ part of the analysis toolbox. Choosing which matched pair papers to feature in #BucketListPapers was surprisingly difficult, given this is our area of expertise. We have chosen one of the first papers in the area:
Matched Molecular Pairs as a Guide in the Optimization of Pharmaceutical Properties; a Study of Aqueous Solubility, Plasma Protein Binding and Oral Exposure
J. Med. Chem., 2006, 49 (23), 6672-6682
For a review of the techniques and the literature, a book chapter is now available:
Matched Molecular Pair Analysis. Andrew Leach
Comprehensive Medicinal Chemistry III, Volume 3 page 221.
The field has advanced due to efficient and reproducible algorithms being published. Here are two key references:
Computationally Efficient Algorithm to Identify Matched Molecular Pairs (MMPs) in Large Data Sets
J. Chem. Inf. Model. 2010, 50, 339–348
WizePairZ: A Novel Algorithm to Identify, Encode, and Exploit Matched Molecular Pairs with Unspecified Cores in Medicinal Chemistry
J. Chem. Inf. Model. 2010, 50, 1350–1357
Matched Molecular Pair Analysis can be used to share knowledge between organisations without sharing the original chemical structures and data.
Learning Medicinal Chemistry Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) Rules from Cross-Company Matched Molecular Pairs Analysis (MMPA)
J. Med. Chem. 2018, 61, 3277−3292
…and it can be combined with machine learning…
Coupling Matched Molecular Pairs with Machine Learning for Virtual Compound Optimization.
J. Chem. Inf. Model. 2017, 57, 12, 3079–3085
BucketListPapers 91/100 – How to Assess Uncertainty in a Given Data Set.
As the abstract says, “The maximum achievable accuracy of in silico models depends on the quality of the experimental data.” The paper beautifully describes methods to examine a given dataset and so “defines a natural upper limit to the predictive performance possible.” This #BucketListPapers uses publicly available data from ChEMBL, and Figure 5 illustrates the spread in the data for pairs of measurements on a given molecule; the 2.5 log lines really show how much measurements can vary. An appreciation of this degree of variation is important when building ML models for drug discovery.
Kramer: The Experimental Uncertainty of Heterogeneous Public Ki Data.
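The link between repeat measurements and an upper limit on model performance can be sketched with two textbook formulas. If each pair holds two independent measurements of the same compound, the per-measurement standard deviation can be estimated from the pair differences; and if a perfect model predicted the true value every time, its R² against noisy measurements would still be capped by the ratio of experimental variance to dataset variance. This is general statistics rather than the exact procedure in the paper, and the pKi values below are invented purely for illustration.

```python
import math


def sigma_from_pairs(pairs):
    """Per-measurement standard deviation estimated from duplicate pairs.

    Each difference of two independent measurements has variance
    2 * sigma**2, hence the factor of 2 in the denominator.
    """
    squared = [(a - b) ** 2 for a, b in pairs]
    return math.sqrt(sum(squared) / (2 * len(squared)))


def max_r_squared(sigma_experiment, sigma_dataset):
    """Best R² a model could reach against measurements with this noise.

    sigma_dataset is the standard deviation of the measured values.
    """
    return 1.0 - (sigma_experiment / sigma_dataset) ** 2


# Invented pKi pairs for the same compound measured twice
toy_pairs = [(7.1, 7.6), (6.0, 5.4), (8.2, 8.3), (5.5, 6.4)]
sigma = sigma_from_pairs(toy_pairs)
print(f"experimental sigma = {sigma:.2f} log units")
print(f"maximum R^2 for a dataset with sigma 1.3 = {max_r_squared(sigma, 1.3):.2f}")
```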
J. Med. Chem. 2012, 55, 5165-5173
A great follow-up, by the same author, covers the impact of uncertainty on Matched Molecular Pair Analysis.
Matched Molecular Pair Analysis: Significance and the Impact of Experimental Uncertainty.
J. Med. Chem. 2014, 57, 9, 3786–3802.
BucketListPapers 90/100 – Identification of pharmacophores and who is the best?
This bucket list paper draws comparisons between three commercially available pharmacophore generation programs, all of which are extensively described. The programs are compared on their ability to identify a target pharmacophore, known from Relibase+ and the literature, for five different proteins with a vast collection of crystallographic data. The pharmacophoric features of interest include hydrogen-bond acceptors and donors, positively and negatively charged groups and hydrophobic centres. Two separate evaluations were performed, the first using a rigid search and the second a flexible search. It is difficult to evaluate how good each of these methods is at generating the target pharmacophores, so two criteria were used to judge them. The first was the RMSD between the found pharmacophores and the target pharmacophore. The second was the number of “misses” that occurred, where either a feature was missing or a wrong functional group was assigned to the pharmacophore. The generated pharmacophores can then be scored and ranked according to these two criteria. The results generated by each method are discussed in detail for all five datasets. These results indicated that Catalyst and GASP outperformed DISCO for all five datasets. GASP ranked first in three out of five cases and Catalyst ranked first in the other two. Both of these methodologies identified the target pharmacophore quickly, finding it within the first ten solutions.
A comparison of the pharmacophore identification programs: Catalyst, DISCO and GASP.
Journal of Computer-Aided Molecular Design, 16 (8-9), 653–681.
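The two judging criteria are easy to picture in code. The sketch below compares a generated pharmacophore with a target one, assuming both are already superimposed in the same coordinate frame and that features are matched by name; how the paper itself pairs up features is not reproduced here. The data layout, feature names and coordinates are hypothetical.

```python
import math


def compare_pharmacophores(found, target):
    """Return (rmsd, misses) for a found pharmacophore against a target.

    Both arguments map a feature name to (feature_type, (x, y, z)).
    A miss is a target feature that is absent from the found
    pharmacophore or present with the wrong feature type.
    """
    misses = 0
    squared = []
    for name, (target_type, target_xyz) in target.items():
        if name not in found or found[name][0] != target_type:
            misses += 1
            continue
        found_xyz = found[name][1]
        squared.append(sum((f - t) ** 2 for f, t in zip(found_xyz, target_xyz)))
    rmsd = math.sqrt(sum(squared) / len(squared)) if squared else float("nan")
    return rmsd, misses


# Hypothetical three-feature target and one candidate solution
target = {
    "acceptor1": ("acceptor", (0.0, 0.0, 0.0)),
    "donor1": ("donor", (3.0, 0.0, 0.0)),
    "hydrophobe1": ("hydrophobe", (0.0, 4.0, 0.0)),
}
candidate = {
    "acceptor1": ("acceptor", (1.0, 0.0, 0.0)),
    "donor1": ("acceptor", (3.0, 0.0, 0.0)),   # wrong type: counts as a miss
    "hydrophobe1": ("hydrophobe", (0.0, 4.0, 0.0)),
}
print(compare_pharmacophores(candidate, target))
```

Candidate solutions could then be ranked by sorting on the number of misses first and RMSD second, which is one plausible reading of the ranking described above.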
BucketListPapers 89/100: How do we find binding pockets in proteins?
Goodford describes, in 1985, a method for probing proteins computationally to find putative binding sites. For budding computational chemists this is a must-read. For medicinal chemists, an understanding of the techniques and an ability to evaluate the results are highly useful. The method applies a grid over the protein and then probes it with ‘spheres’ that represent a small number of chemical groups [NH3+, =O, O-, OH, CH3, H2O]. Drawing on the underlying molecular physics, Lennard-Jones potentials and electrostatic interactions are calculated and a model for hydrogen bonding is applied for each of these groups, from which the energy of binding is estimated. The paper has several examples of the algorithm probing phospholipase-A2 structures.
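The grid-and-probe idea can be sketched in a few lines. The code below places a spherical probe on each point of a cubic grid and sums a Lennard-Jones term and a Coulomb term over the atoms of the target. It is a deliberately stripped-down illustration: there is no hydrogen-bond term, the dielectric is a single constant, and the well depth, contact distance and dielectric values are placeholders chosen by us rather than parameters from the paper.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal·Å/(mol·e²), converts charge products to energy


def probe_energy(probe_xyz, probe_charge, atoms,
                 epsilon=0.1, r_min=3.5, dielectric=4.0):
    """Interaction energy (kcal/mol) of a probe with a set of atoms.

    atoms : iterable of (x, y, z, partial_charge)
    epsilon, r_min, dielectric : illustrative placeholder parameters
    """
    energy = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(probe_xyz, (x, y, z)), 0.5)  # avoid dividing by zero
        ratio6 = (r_min / r) ** 6
        lennard_jones = epsilon * (ratio6 * ratio6 - 2.0 * ratio6)
        coulomb = COULOMB_CONSTANT * probe_charge * charge / (dielectric * r)
        energy += lennard_jones + coulomb
    return energy


def scan_grid(atoms, probe_charge, lower, upper, spacing):
    """Evaluate the probe on a cubic grid; return (lowest energy, grid point)."""
    n_points = int(round((upper - lower) / spacing)) + 1
    axis = [lower + i * spacing for i in range(n_points)]
    results = []
    for x in axis:
        for y in axis:
            for z in axis:
                point = (x, y, z)
                results.append((probe_energy(point, probe_charge, atoms), point))
    return min(results)


# A toy "site": two partially charged atoms, probed with a +1 charged sphere
toy_atoms = [(0.0, 0.0, 0.0, -0.5), (2.5, 0.0, 0.0, 0.3)]
print(scan_grid(toy_atoms, probe_charge=1.0, lower=-6.0, upper=6.0, spacing=1.0))
```

Contouring the full set of grid energies, rather than taking only the minimum, gives the familiar maps of favourable regions around a protein.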
We think every methodology paper should start its ‘Discussion’ section with the phrase: “Before assessing the findings, it is necessary to consider the shortcomings of the method.” It is worth reading the paper for this section alone, with its detailed self-assessment of the method.
Determining binding sites within proteins is an area of continuous development and an exciting one for AI drug discovery. The combination of protein folding algorithms, binding pocket determination and directed screening to select small sets of compounds offers a genuine opportunity to reduce the cost and time of drug discovery research.
A computational procedure for determining energetically favorable binding sites on biologically important macromolecules
J. Med. Chem. 1985, 28, 7, 849–857
BucketListPapers 88/100: Energising QSAR Models
Comparative Molecular Field Analysis (CoMFA). 1. Effect of Shape on Binding of Steroids to Carrier Proteins.
J. Am. Chem. Soc. 1988, 110, 5959-5967.
Comparative molecular field analysis (CoMFA) is a method that was created for three-dimensional quantitative structure-activity relationship (3D-QSAR) analysis. The work Cramer et al. did was an extension of work that had been done several years prior [1]. The CoMFA analysis is performed by placing the molecules into a 10-degree grid, minimising the structures and aligning them within the lattice. A QSAR table is then generated by calculating the energy at several points within the grid. This energy value incorporates the electrostatic and steric properties of the molecule. These energies can then be used as descriptors in QSAR models. The CoMFA method was then applied to two sets of steroid structures, the first containing 21 molecules and the second 10 molecules, and the descriptors for both were used in a partial least squares (PLS) model. The results highlighted CoMFA as a promising QSAR method.
[1] Wise, M., Cramer, R.D., Smith, D. and Exman, I. (1983) Progress in three-dimensional drug design: the use of real-time colour graphics and computer postulation of bioactive molecules in DYLOMMS. In Quantitative Approaches to Drug Design (Proceedings of the 4th European Symposium on Chemical Structure-Biological Activity: Quantitative Approaches) pp. 145-146. Elsevier, Amsterdam
BucketListPapers 87/100: The Ramachandran Plot
A protein’s function is inherently linked to its secondary and tertiary structure, which is determined by its amino acid sequence. Understanding the structures of various disease-linked proteins is, therefore, a crucial step towards designing modulators that can influence protein function. When considering the conformational propensity of a polypeptide, given that the CO-NH peptide bond atoms have a very stable planar conformation, it is apparent that the coordinates that determine secondary structure are the two backbone dihedral angles defined by the atoms CO-NH-C(α)-CO (Φ) and NH-C(α)-CO-NH (Ψ). This idea was first introduced by Ramachandran et al. in the 1960s, who hypothesised that only certain regions of Φ / Ψ space are energetically feasible given the minimum contact distances permitted by the van der Waals interactions between atoms in adjacent residues (https://doi.org/10.1016/S0022-2836(63)80023-6). Indeed, when Ramachandran et al. analysed structural data from di-, tri- and polypeptides, they found that all of the structures resided within the theoretically permitted regions of Φ / Ψ space, and that the space occupied was characteristic of the secondary structure present in the polypeptide chain.
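Calculating Φ and Ψ only needs the coordinates of five backbone atoms. Here is a minimal sketch of the standard dihedral angle calculation in plain Python; the function names are ours, and in practice the coordinates would come from a structure file rather than the made-up points used in the checks below.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, s):
    return tuple(x * s for x in a)


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along the axis
    # Components of the outer bonds perpendicular to the central bond
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def backbone_angles(c_prev, n, ca, c, n_next):
    """Return (phi, psi) for one residue from five backbone atom positions."""
    phi = dihedral(c_prev, n, ca, c)
    psi = dihedral(n, ca, c, n_next)
    return phi, psi


# Made-up points: a cis (0°), a trans (180°) and a +90° arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Plotting the (Φ, Ψ) pairs for every residue in a structure gives the Ramachandran plot.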
Stereochemistry of polypeptide chain configurations.
Today, Ramachandran analysis remains a staple for protein structure validation, secondary structure analysis and the parameterisation of molecular models and simulations:
Structural biology data archiving – where we are and what lies ahead.
FEBS Letters. 2018. 592(12). 2153-2167.
A New Generation of Crystallographic Validation Tools for the Protein Data Bank.
Structure. 2011. 19(10). 1395-1412.
Combining Ramachandran plot and molecular dynamics simulation for structural-based variant classification: Using TP53 variants as model.
BucketListPapers 86/100: A Step Closer to Unlocking the Potential of FEP for Drug Discovery
Free energy perturbation (FEP), a molecular dynamics (MD) simulation technique that involves transforming a model system between two states to calculate the free energy difference between the states, has been well used and developed within academia for decades. In the context of drug discovery, this methodology has promised the ability to calculate relative protein binding energies of similar compounds (e.g. matched molecular pairs) by performing alchemical transformation simulations between the compounds in solution and embedded in the protein pocket. However, the challenges that are associated with many MD simulation methodologies also apply here and have prevented the efficient use of FEP within industrial projects to guide lead optimisation. Namely, these challenges include the limited accuracy of molecular mechanics force fields, the computational cost involved in sampling the conformational and configurational landscape of the system and the technical barrier for users to implement specialist MD methods.
In this publication by Wang et al., researchers at Schrödinger present their extensive work towards resolving some of these issues. Importantly, the paper highlights efforts to generate a more transferable small molecule force field (OPLS2.1) that is parameterised using 10,000 drug-like compounds, an implementation of a sophisticated enhanced sampling approach (replica exchange with solute tempering (REST)) within the FEP simulation protocol, and the automation of missing parameter calculations and simulation setups, making the approach accessible for industrial medicinal chemists. The results presented show a significant improvement in force field accuracy compared to a previous OPLS model and the commonly used MMFF force field, along with promising false negative and true positive rates when the method was tested on a retrospective project, and a 6-fold enrichment in compound selection for synthesis within a prospective drug discovery project. The ability to improve the accuracy of relative binding affinity predictions and the usability of FEP technologies through user-friendly interfaces brings us a step closer to allowing industrial medicinal chemists to fully utilise FEP to benefit their projects.
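The core of the method can be written down compactly. The sketch below implements the classic exponential-averaging (Zwanzig) estimator for the free energy difference between two states, and the thermodynamic cycle that turns two alchemical transformations (one in the protein pocket, one in solution) into a relative binding free energy. It is a textbook illustration only: production protocols use many intermediate states and more robust estimators, and the energy values below are invented.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol·K)


def zwanzig_free_energy(delta_u, temperature=298.15):
    """Free energy difference from energy differences sampled in state A.

    delta_u : list of U_B - U_A values (kcal/mol) for configurations
              sampled from state A.
    Implements dF = -kT * ln < exp(-dU / kT) >_A.
    """
    kT = K_B * temperature
    scaled = [-du / kT for du in delta_u]
    largest = max(scaled)  # log-sum-exp trick for numerical stability
    log_mean = largest + math.log(
        sum(math.exp(s - largest) for s in scaled) / len(scaled))
    return -kT * log_mean


def relative_binding_free_energy(dg_complex, dg_solvent):
    """Thermodynamic cycle: ddG_bind = dG(A->B, bound) - dG(A->B, in solution)."""
    return dg_complex - dg_solvent


# Invented energy differences for the A->B transformation
samples_in_complex = [-2.1, -1.7, -2.4, -1.9, -2.0]
samples_in_solvent = [-0.6, -0.9, -0.4, -0.8, -0.7]
ddg = relative_binding_free_energy(
    zwanzig_free_energy(samples_in_complex),
    zwanzig_free_energy(samples_in_solvent))
print(f"relative binding free energy = {ddg:.2f} kcal/mol")
```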
Accurate and Reliable Prediction of Relative Ligand Binding Potency in Prospective Drug Discovery by Way of a Modern Free-Energy Calculation Protocol and Force Field.
BucketListPapers 85/100: Amber: The force (field) is strong with this one!
Following on from our previous bucket list release on the CHARMM MD package and force fields, we now choose to introduce Amber, another MD simulation package and force field family that is frequently employed for biomolecular simulations in drug discovery. Like CHARMM, the Amber MD software originated in the late 1970s and has been maintained and expanded throughout the years. Following over a decade of model refinement and optimisation, the modern family of Amber force fields began with the development of the Cornell et al. force field in 1995 (commonly referred to as ff94), with the potential energy equation including bonded and non-bonded terms and parameters for amino acids and nucleic acids. Importantly, Cornell et al. established a general parameterisation procedure to ensure continuity and additivity for future iterations of molecule parameterisation. The procedure involves the derivation of partial atomic charges from an electrostatic potential, determined using QM calculations with the 6-31G* basis set, followed by two-stage restrained electrostatic potential (RESP) fitting. Since ff94, many refinements, corrections and additions to the Amber force field have been published, including ff14SB and Lipid14, so that parameters are now available for amino acids, nucleic acids, carbohydrates and lipids. There is also a General Amber Force Field (GAFF) that, like CGenFF, allows anyone to parameterise small molecules in a way that is consistent with the biomolecular force fields, allowing the interactions between drug molecules and their targets to be explored using simulation.
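For reference, the bonded and non-bonded terms mentioned above take the following functional form in the Cornell et al. force field, written here in LaTeX. The first three sums are the bonded terms (bond stretching, angle bending and dihedral torsion) and the final sum holds the non-bonded Lennard-Jones and Coulomb terms between atom pairs.

```latex
E_{\text{total}} =
    \sum_{\text{bonds}} K_r \,(r - r_{eq})^2
  + \sum_{\text{angles}} K_\theta \,(\theta - \theta_{eq})^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
  + \sum_{i<j} \left[ \frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}}
  + \frac{q_i q_j}{\epsilon R_{ij}} \right]
```

The partial charges q in the last term are the quantities that the RESP fitting procedure described above is designed to supply.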
The Amber biomolecular simulation programs.
J Comp Chem. 2005. 26(16) 1668-1688
A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules.
JACS. 1995. 117(19). 5179-5197.
ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB.
J. Chem. Theory Comput. 2015. 11(8). 3696-3713.
Lipid14: The Amber Lipid Force Field.
J. Chem. Theory Comput. 2014. 10(2). 865-879.
Development and testing of a general amber force field.
J. Comput. Chem. 2004. 25(9). 1157-1174.
BucketListPapers 84/100: CHARMM: may the force (field) be with you!
First conceived in the 1950s and advanced through the latter half of the 20th century, molecular dynamics simulations have become a key branch of computer-aided drug discovery. The ability to calculate the evolution of atomic positions through time, based on an underlying molecular mechanics energy function, brought with it the potential to explore dynamic ensembles of biomolecular drug targets, and to gain mechanistic understanding of drug interactions at an atomic level. Significant contributions to the field have been made by Karplus and MacKerell, who have been responsible for the development of the popular CHARMM MD software and force fields (the potential energy function and parameters used as the underlying model) (https://doi.org/10.1002/jcc.540040211). The first CHARMM force field (CHARMM19) was developed in the 1980s and used a united-atom representation (where implicit hydrogens were treated as part of the attached heavy atom) (https://doi.org/10.1063/1.472061). Since then, multiple re-parameterisations, corrections and additions, where parameters were fitted to experimental structure and ab initio data, have been released (https://doi.org/10.1021/jp973084f). Today, the CHARMM force fields utilise an all-atom model that incorporates bonded and non-bonded potential energy terms and parameters for the basic biochemical building blocks, e.g. the standard amino acids, phospholipids, and nucleic acids (https://doi.org/10.1038/nmeth.4067). In addition, the CHARMM general force field (CGenFF) allows users to fit parameters to a wide range of small molecules based on the chemical groups and atom types present in the structure, which crucially allows researchers to model the interactions of small molecules with biomolecular targets (https://doi.org/10.1002/jcc.21367).
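What "calculating the evolution of atomic positions through time" means in practice can be shown with the velocity Verlet scheme, a widely used MD integrator, applied to the simplest possible system: one particle on a harmonic spring, which stands in for a single bond-stretch term of a force field. This is a one-dimensional toy in arbitrary units, not a description of how CHARMM itself is implemented, and the function names are ours.

```python
def harmonic_force(x, k=1.0, x0=0.0):
    """Force from the potential U = 0.5 * k * (x - x0)**2."""
    return -k * (x - x0)


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle in 1D; returns the list of (x, v) states."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * (f / mass) * dt * dt   # update position
        f_new = force(x)                              # force at new position
        v = v + 0.5 * (f + f_new) / mass * dt         # update velocity
        f = f_new
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the harmonic oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


trajectory = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                             mass=1.0, dt=0.01, n_steps=1000)
start, end = trajectory[0], trajectory[-1]
print("energy at start:", total_energy(*start))
print("energy at end:  ", total_energy(*end))
```

The total energy stays close to its starting value over the run, which is the property that makes this family of integrators a common choice for long simulations. A real MD engine does the same thing in three dimensions for every atom, with the force coming from the full force field.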
CHARMM: A program for macromolecular energy, minimization, and dynamics calculations
Journal of Computational Chemistry, 1983, 4, 187-217.
Simulation of activation free energies in molecular systems
J. Chem. Phys. 105, 1902 (1996)
CHARMM36m: an improved force field for folded and intrinsically disordered proteins
Nature Methods volume 14, 2016, pages 71–73
CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields
Journal of Computational Chemistry, 2010, 31, 671-690.
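For readers who have not met a force field written out, the "bonded and non-bonded potential energy terms" mentioned above can be sketched with the generic additive form below. This is a simplified illustration, not the full CHARMM function: it leaves out terms such as improper dihedrals, Urey-Bradley and backbone correction maps, and the symbol names are our own choice of notation.

```latex
U(\mathbf{r}) =
  \sum_{\text{bonds}} k_b \,(b - b_0)^2
+ \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
+ \sum_{\text{dihedrals}} k_\phi \,\bigl[1 + \cos(n\phi - \delta)\bigr]
+ \sum_{i<j} \left\{
    \varepsilon_{ij}\left[
      \left(\frac{R_{\min,ij}}{r_{ij}}\right)^{12}
      - 2\left(\frac{R_{\min,ij}}{r_{ij}}\right)^{6}
    \right]
    + \frac{q_i q_j}{4\pi\varepsilon_0\, r_{ij}}
  \right\}
```

The first three sums are the bonded terms (bond stretches, angle bends, torsions); the last sum covers the non-bonded Lennard-Jones and Coulomb interactions between atom pairs. The force constants, reference geometries, well depths and partial charges are the "parameters" that each re-parameterisation adjusts.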
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 83/100: The amazing Ames test – avoiding mutagens
For a long time, I had assumed that the AMES test was an abbreviation and that it was one of the most brilliant transformations of a surprising insight into something useful. I was delighted and envious when I discovered that this clever trick was actually named after its inventor, Bruce Ames. In this test for mutagenicity, a range of bacterial strains that have naturally occurring mutations preventing them from synthesising a specific amino acid (usually histidine) are grown in media lacking that amino acid. Exposure to certain types of mutagen causes reversion to a form that is able to synthesise the amino acid, and hence the bacteria can grow and reproduce, leading to the appearance of colonies on the plates (Figure). The details of the test, including some of its history, are described in this article by Mortelmans and Zeiger. The paper is a veritable manual for those wishing to run the test, with troubleshooting and detailed instructions provided for all procedures. One of the key challenges is that many compounds are not mutagenic until they are metabolised, so the test requires the inclusion of some form of metabolic system, and the choice of this can be critical. There remains much debate about the reproducibility and relevance of the Ames test, but it has been a crucial defence in pre-clinical screening.
The Salmonella (Ames) test for mutagenicity
K Mortelmans, E Zeiger
Curr. Protoc. Toxicol. 2001, Chapter 3, Unit 3.1.
Among many others, I’ve shown how results in the Ames test for aromatic amines (a common mutagen class) can be correlated with simple chemical descriptors and that this can be used to design safer compounds (doi: 10.1021/jm3001295). McCarren and colleagues at Novartis have shown that this approach likely has a dependence on the size of the molecule (doi: 10.1016/j.bmc.2011.03.066), and Shamovsky and colleagues at AstraZeneca (doi: 10.1021/acs.jmedchem.1c00514) have probed the metabolic process, which may relate to the molecular weight limit already mentioned. The beauty of this area is that, unlike much of biology that is utterly mysterious when looked at by chemists, mutagenicity often has a strong chemical component that can be rationalised and predicted.
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 82/100: Hydrogen Bonds – not all the same!
Although Lipinski’s rules and related approaches demand that we count any hydrogen bond donor or acceptor (defined in a rather limited way) as equivalent, there is a significant strand of experimental and computational work that has sought to quantify the differences between the acceptor and donor ability of different functional groups. I was first introduced to these ideas through one of the papers in the series Hydrogen Bonding, Parts 1 to at least 9. In part 9, Abraham et al. gave an overview of their measurements made using a standard donor (4-nitrophenol) or a standard acceptor (N-methylpyrrolidinone) in 1,1,1-trichloroethane. A large array of functional group types was studied. Subsequently, these data were found to correlate well with electrostatic potentials computed in quantum mechanical calculations by Kenny in two important publications (Acceptors: and Donors:). This work was extended to show how hydrogen bonding strength could be related to differences between partition coefficients measured in 1-octanol and partition coefficients measured in hexadecane.
In parallel with this activity, mostly undertaken at AstraZeneca, Hunter was using similar considerations to build a unifying model of the nature of interactions between molecules that led to the map shown below, in which blue regions represent acceptors (across the top) and donors (down the side) that are able to form attractive interactions in water. The red regions represent those interaction types that involve loss of so much solvation by the more polar partner that the new interaction is unable to compensate for it.
Quantifying Intermolecular Interactions: Guidelines for the Molecular Recognition Toolbox
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 81/100: Pair up your atoms….
The concept of encoding structures as pairs of atoms and the distance between them underpins many of the grouping techniques that are particularly popular in the analysis of HTS results and other large collections of biochemical data. Here, Carhart, Smith and Venkataraghavan introduce atom pairs and describe how the shortest path between two atoms can be identified by crawling through a molecule, starting at each atom in turn, to find the shortest distance to every other atom. This way of encoding molecular structure is illustrated for acetone and isobutylene. It gives a very compact and general representation that allows usefully surprising connections between molecules to be made.
Atom pairs as molecular features in structure-activity studies: definition and applications
J. Chem. Inf. Comput. Sci. 1985, 25, 2, 64–73
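The "crawl from each atom in turn" idea is essentially a breadth-first search over the bond graph. The sketch below is a minimal illustration of that idea and not the authors' implementation: atoms are typed by element only (the paper uses richer atom descriptions), distances are counted in bonds, and the hand-written heavy-atom graphs for acetone and isobutylene, along with all function names, are our own assumptions.

```python
from collections import Counter, deque
from itertools import combinations


def shortest_path_lengths(adjacency, start):
    """Breadth-first search: distance in bonds from `start` to every reachable atom."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in distances:
                distances[neighbour] = distances[atom] + 1
                queue.append(neighbour)
    return distances


def atom_pairs(atom_types, adjacency):
    """Count (type, distance, type) triples over all pairs of atoms."""
    all_distances = {atom: shortest_path_lengths(adjacency, atom) for atom in adjacency}
    pairs = Counter()
    for i, j in combinations(sorted(adjacency), 2):
        if j not in all_distances[i]:
            continue  # atoms in disconnected fragments form no pair
        # Sort the two types so that C...O and O...C are the same descriptor.
        first, second = sorted((atom_types[i], atom_types[j]))
        pairs[(first, all_distances[i][j], second)] += 1
    return pairs


# Heavy-atom graphs (hydrogens omitted); atom 1 is the central carbon in both.
acetone_types = {0: "C", 1: "C", 2: "O", 3: "C"}
isobutylene_types = {0: "C", 1: "C", 2: "C", 3: "C"}
star_bonds = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

acetone_pairs = atom_pairs(acetone_types, star_bonds)
isobutylene_pairs = atom_pairs(isobutylene_types, star_bonds)

# Pairs present in both molecules, a crude measure of shared features.
shared_pairs = acetone_pairs & isobutylene_pairs
```

With element-only typing, the two molecules share their carbon-carbon pairs and differ only in the pairs that involve oxygen, which is the kind of overlap that makes atom pairs useful for grouping compounds.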
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 80/100: Polar exploration complements non-polar exploration
By investigating the correlation of a variety of group descriptors, Craig established that there is a group of non-polar descriptors, such as π, molecular volume, parachor, Es and group refractivity, that all correlate to a significant extent. By contrast, the various polar descriptors considered did not show appreciable correlations either with the non-polar set or among themselves (confirming earlier findings). Having established the independence of these two types of descriptor, it follows that choosing groups that span a range of π and σ would be an effective way to explore SAR. The distribution of substituents by values appropriate for the para position is shown, and selection of groups from each quadrant should be considered.
Interdependence between physical parameters and selection of substituent groups for correlation studies. Craig, Paul N.
J. Med. Chem. 1971, 14, 680–684.
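Picking substituents "from each quadrant" can be sketched in a few lines. The example below is only an illustration: the π and para σ values are approximate figures of the kind found in standard substituent constant tables, quoted here from memory, and should be checked against a primary compilation before use. The function names are our own.

```python
def quadrant(pi, sigma):
    """Label a substituent by the sign of its lipophilicity (pi) and electronic (sigma) constants."""
    lipophilicity = "+pi" if pi > 0 else "-pi"
    electronics = "+sigma" if sigma > 0 else "-sigma"
    return f"{lipophilicity}/{electronics}"


def group_by_quadrant(substituents):
    """Group substituent names by quadrant of the pi/sigma plot."""
    groups = {}
    for name, (pi, sigma) in substituents.items():
        groups.setdefault(quadrant(pi, sigma), []).append(name)
    return groups


# Approximate (pi, sigma_para) values; assumptions to be verified before use.
para_substituents = {
    "CF3": (0.88, 0.54),
    "Cl": (0.71, 0.23),
    "CH3": (0.56, -0.17),
    "CN": (-0.57, 0.66),
    "NO2": (-0.28, 0.78),
    "OH": (-0.67, -0.37),
    "NH2": (-1.23, -0.66),
    "OCH3": (-0.02, -0.27),
}

quadrants = group_by_quadrant(para_substituents)
```

Choosing at least one group from each entry of `quadrants` varies lipophilicity and electronic character independently, so an observed change in activity is easier to attribute to one property or the other.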
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 79/100: Designing CNS penetrant drugs….maybe the hardest thing?
Perhaps the hardest medicinal chemistry challenge is taking a series of compounds and modifying them so that they pass through the blood brain barrier. Such drug discovery projects have enlarged testing cascades, and usually require an early in-vitro absorption assay and secondary rodent pharmacokinetics experiments to measure the concentration of drug in the brain compartment. As such, the Design-Make-Test-Analyse cycle times are protracted. A clear understanding of the processes involved, and of the properties required in molecules, is paramount in order to reduce the number of cycles. We selected the review “Demystifying brain penetration…” as it gives a broad summary of the understanding so far.
Demystifying brain penetration in central nervous system drug discovery.
In recent years the “CNS MPO” score, which can be calculated at the point of design, has gained popularity, so it is worth referencing this paper too.
Moving beyond Rules: The Development of a Central Nervous System Multiparameter Optimization (CNS MPO) Approach To Enable Alignment of Druglike Properties
ACS Chem. Neurosci. 2010, 1, 435–449
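To show what "calculated at the point of design" means in practice, here is a minimal sketch of a CNS MPO style score: six properties are each mapped to a desirability between 0 and 1 and the results are summed to give a score from 0 to 6. The thresholds are given as we recall them from the paper above and should be checked against it before use; the function names and the example property values are hypothetical.

```python
def falling(value, full, zero):
    """Desirability of 1.0 at or below `full`, 0.0 at or above `zero`, linear in between."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)


def hump(value, low_zero, low_full, high_full, high_zero):
    """Desirability of 1.0 inside [low_full, high_full], ramping to 0.0 at either outer limit."""
    if value <= low_zero or value >= high_zero:
        return 0.0
    if value < low_full:
        return (value - low_zero) / (low_full - low_zero)
    if value <= high_full:
        return 1.0
    return (high_zero - value) / (high_zero - high_full)


def cns_mpo(clogp, clogd, mw, tpsa, hbd, pka):
    """Sum of six desirabilities (0 to 6); thresholds are assumptions recalled from the paper."""
    return (
        falling(clogp, 3.0, 5.0)            # calculated logP
        + falling(clogd, 2.0, 4.0)          # calculated logD at pH 7.4
        + falling(mw, 360.0, 500.0)         # molecular weight
        + hump(tpsa, 20.0, 40.0, 90.0, 120.0)  # topological polar surface area
        + falling(hbd, 0.5, 3.5)            # hydrogen bond donor count
        + falling(pka, 8.0, 10.0)           # most basic pKa
    )


# Hypothetical design idea: only the single donor costs anything, giving about 5.83.
example_score = cns_mpo(clogp=2.0, clogd=1.0, mw=300.0, tpsa=60.0, hbd=1, pka=7.5)
```

All six inputs can be computed from a drawn structure, which is why the score can be applied before a compound is made.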
However, the CNS MPO score might become dated as understanding of the transporters involved grows. It may be that the number of hydrogen bond donors (HBD) is the key predictor.
How hydrogen bonds impact P-glycoprotein transport and permeability.
Bioorg. Med. Chem. Lett. 2012, 22, 6540−6548.
And lastly a couple of medchem papers on series that were optimised for CNS penetration:
The design and identification of brain penetrant inhibitors of phosphoinositide 3-kinase α.
J. Med. Chem. 2012, 55, 8007−8020.
Optimization of Brain Penetrant 11β-Hydroxysteroid Dehydrogenase Type I Inhibitors and in Vivo Testing in Diet-Induced Obese Mice
J. Med. Chem. 2014, 57, 970−986
A couple more papers, suggested by Pete Kenny:
CNS Drug Design: Balancing Physicochemical Properties for Optimal Brain Exposure
J. Med. Chem. 2015, 58, 2584–2608
Structural Modifications that Alter the P-Glycoprotein Efflux Properties of Compounds
J. Med. Chem. 2012, 55, 4877–4895
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 78/100: Improving on existing drugs, covalent binders and the late LO medicinal chemistry
A successful drug in the clinic is the final proof that modulation of a protein target yields benefits. The drug in question may have limitations, such as side-effects from off-target toxicity, dose-limited efficacy and suchlike. As such, drug discovery programs ‘restart’ with the aim of making a ‘gold-standard’ treatment, armed with a wealth of data and understanding from the clinic. These programs can be difficult and have protracted testing cascades, as a large number of measurements is required to demonstrate superiority.
This paper is one of a host of recent examples that aim to optimise for a specific mutation of the protein target while controlling off-target effects. The selectivity is gained using the covalent binding group, and off-target selectivity (IGF1R) is achieved by a combination of groups, shown by some excellent matched pair analysis. This in itself is great medicinal chemistry, but the paper also describes some of the trials and tribulations of late LO and early pre-clinical work, including a small amount of human PK data.
Discovery of a Potent and Selective EGFR Inhibitor (AZD9291) of Both Sensitizing and T790M Resistance Mutations That Spares the Wild Type Form of the Receptor
J. Med. Chem. 2014, 57, 20, 8249–8267
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 77/100: The first tyrosine kinase inhibitor…
Perhaps this paper is showing its age, but at the time BMCL papers were basic, with just a table or two of measurements and a simple synthesis. Nonetheless, this is the paper describing the discovery of a candidate drug (1) that went on to become the product Gleevec (Imatinib), the first tyrosine kinase inhibitor (the Wikipedia page has more data). The drug made quite an impact on many patients’ lives. Looking back, with experienced eyes, we can see the structural features now present in ATP binding pocket kinase inhibitors. The research process was rational and screened multiple kinases, leading by careful SAR analysis to 1. The compound is quite selective, in part due to the methyl group on the central ring. Of note is the ‘pan’ kinase inhibitor 9: this compound might be worth adding to your compound collections, so that early screening could yield a tool compound.
Imatinib – Potent and selective inhibitors of the Abl-kinase: phenylamino-pyrimidine (PAP) derivatives
Bioorg. & Med. Chem. Lett. 1997, 7, 187-192
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe
BucketListPapers 76/100: Macrocyclisation as the route to success?
The story of Lorlatinib started with an existing inhibitor and used macrocyclisation to improve properties. Macrocyclisation has been a recent approach to improving small molecules, and is usually performed as part of a structure-based design program. The ideal starting point is a high quality x-ray structure of the small molecule bound to the protein that shows the molecule is folded in such a way that groups in the molecule are close enough together to ‘join up’, thus forming a large ring. The difference in entropy on binding can be significant and can yield highly potent compounds. In addition, these macrocycles present different shapes and have been demonstrated to have improved absorption and reduced efflux. Although the approved drug Crizotinib has robust efficacy in ALK-positive tumors, patients eventually develop resistance, and metastases to the brain can occur. Many cancer drugs have poor penetration across the Blood Brain Barrier (BBB), so a new suite of kinase inhibitors with high brain exposure is highly desirable. The authors of this paper started from Crizotinib and initially made acyclic compounds, focussing on efficient small molecules (high LipE) to improve brain penetration (see references within). Co-crystallisation of these inhibitors with ALK showed that the compounds were folded in a U-shape, positioning two aryl groups close together (see Figure 3). The idea for macrocyclic compounds was born, but this was not without difficulties. The paper describes the medicinal chemistry and synthetic chemistry challenges in great detail and is rich in data and analysis.
Lorlatinib.
Discovery of (10R)-7-amino-12-fluoro-2,10,16-trimethyl-15-oxo-10,15,16,17-tetrahydro-2H-8,4-(metheno)pyrazolo[4,3-h][2,5,11]-benzoxadiazacyclotetradecine-3-carbonitrile (PF-06463922), a macrocyclic inhibitor of anaplastic lymphoma kinase (ALK) and c-ros oncogene 1 (ROS1) with preclinical brain exposure and broad-spectrum potency against ALK-resistant mutations.
J. Med. Chem. 2014, 57, 11, 4720–4744
#BucketListPapers #DrugDiscovery #MedicinalChemistry #BeTheBestChemistYouCanBe