Much of medicinal chemistry is concerned with strengthening or weakening the interactions between a small molecule and a variety of biomolecules. This can mean increasing the affinity of a ligand for a receptor, or reducing affinity for an undesired off-target such as hERG or CYP450. Whilst the overall physicochemical properties of the molecule can have a major influence, it is likely that specificity is driven by optimising the strength and geometry of specific molecular interactions.
Strength of interactions
Whilst the strength of a covalent single bond is usually in the region of 80-100 kcal/mol, the non-covalent interactions exploited by medicinal chemists are much weaker. Andrews has tried to estimate the average strength of various molecular interactions by examining the structural components and binding affinities of 200 compounds. Others have tried to estimate the strength of interactions by using chemical double-mutant cycles.
|Typical Energies|
|Interaction||Energy|
|Salt Bridge||~2 kcal/mol|
|H-Bond||~1 kcal/mol|
|Hydrophobic||~0.7 kcal/mol|
|Aromatic||~1-3 kcal/mol|
Since dG = -2.303RT log K, we can calculate that a single ionic interaction might afford a 25-fold increase in affinity, a hydrogen bond a 6-fold increase, and a methyl group a 3.5-fold increase in binding constant. However, it is important to note that steric clashes can have a much more pronounced impact on affinity. The interaction between two atoms is described by the Lennard-Jones potential, shown graphically below.
As you can see, the attractive forces predominate when the atoms are further apart, but when they get too close the repulsive forces rapidly become dominant. Hence a small steric clash can cause the loss of all affinity.
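As a rough check on the numbers above, the relation dG = -2.303RT log K and the 12-6 Lennard-Jones form can be evaluated directly. The sketch below is illustrative only: the temperature of 298.15 K and the Lennard-Jones parameters (well depth and sigma) are assumed values chosen for demonstration, not taken from Andrews or from any force field. The fold changes it prints are of the same order as those quoted above; the exact figures depend on the temperature and on the precise energies used.

```python
R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_change(energy_kcal, temperature_k=298.15):
    """Fold increase in binding constant for a favourable energy gain.

    Rearranges dG = -2.303 RT log K to K = 10 ** (-dG / (2.303 RT));
    energy_kcal is the magnitude of the favourable dG in kcal/mol.
    """
    return 10 ** (energy_kcal / (2.303 * R_KCAL * temperature_k))


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy at separation r (same units as sigma)."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)


# Typical energies from the table above (kcal/mol).
for label, energy in [("salt bridge", 2.0), ("hydrogen bond", 1.0), ("methyl group", 0.7)]:
    print(f"{label}: {energy} kcal/mol -> roughly {fold_change(energy):.0f}-fold")

# Assumed, purely illustrative Lennard-Jones parameters.
EPSILON = 0.1  # well depth, kcal/mol
SIGMA = 3.4    # separation at which the energy crosses zero, Angstroms
R_MIN = 2 ** (1 / 6) * SIGMA  # separation at the energy minimum

for r in (0.8 * SIGMA, 0.9 * SIGMA, SIGMA, R_MIN, 1.5 * SIGMA):
    print(f"r = {r:.2f} A, E = {lennard_jones(r, EPSILON, SIGMA):+.3f} kcal/mol")
```

With these assumed parameters, compressing the contact to 0.8 sigma costs far more energy than the depth of the attractive well, which is the point made above about small steric clashes.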
With many of these interactions, simply looking at a model of a protein with a ligand docked can be misleading, since considering only the bound state gives just part of the equation. Desolvation plays a very important role, particularly with ionised or polar functional groups, where there will be a large unfavourable desolvation term. In addition, entropic changes can have a significant influence.
A detailed analysis of the molecular interactions found between ligands and macromolecules has been undertaken in "A systematic analysis of atomic protein–ligand interactions in the PDB" DOI. Looking at over 11,000 complexes, the authors were able to categorise the 7 most common types of interaction (Fig 1; image courtesy of Matthieu Schapira).
The most common interactions are hydrophobic contacts, hydrogen bonds and pi-stacking.
For many biogenic amines, a key requirement for ligand recognition is the interaction between a protonated amine in the ligand and a specific aspartic acid residue buried in the membrane domain of a GPCR. Kumar has shown that the 3D structure of many proteins is stabilised by internal salt bridges, many buried within the core of the protein. In many cases a salt bridge is really a combination of hydrogen bonding and an ionic interaction. Estimates for the strength of such interactions can be misleading, since there is a large unfavourable desolvation term, and for mobile residues on the surface of proteins there may be a significant entropic loss. However, in the case of salt bridges buried within the core of the protein, there are many examples where they have been shown to be highly stabilising. In general, however, as Fig 1 shows, this is not a particularly common interaction, although charged functionality may improve the physicochemical properties of the molecule.
One of the most stabilising salt bridges is formed by residues Glu27 and Arg387 in human salivary α-amylase (PDB code: 1smd), shown in the image below. Whilst there is a significant desolvation penalty, the side-chain centroids are 3.7 Angstroms apart, and the measured hydrogen bond length is only 1.94 Angstroms. The ionised carboxylate oxygen atoms of the Glu27 side chain form two very strong hydrogen bonds with the protonated nitrogen atoms in the guanidinium group of Arg387, with the additional possibility of extra, weaker cross hydrogen bonds increasing stability. dG for this salt bridge has been calculated to be -22.4 kcal/mol.
Glu27 and Arg387 in human salivary α-amylase
The hydrogen bond is a ubiquitous element of molecular recognition in biological systems. Estimating the energetic contribution of hydrogen bonds is sometimes problematic, since in aqueous solution formation of a hydrogen bond requires the desolvation of both donor and acceptor. However, current estimates range from 1-40 kcal/mol. The higher-energy hydrogen bonds occur in the case of charge-reinforced hydrogen bonds (see above). The length of the hydrogen bond ranges from 2.6 Angstroms to 3.1 Angstroms, based on observations from the PDB (Structure in Protein Chemistry, Jack Kyte).
|Table of Length of Hydrogen Bonds|
|A-H....B||Functional Group||Avg Bond length|
The geometry of these interactions has been studied and the preferred geometries identified for a number of functional groups (Journal of Computer-Aided Molecular Design, 10 (1996) 607-622), based on data taken from the Cambridge Structural Database (CSD).
Hydrogen-bonding properties have often been correlated with pKa data. Within a chemical series, e.g. substituted pyridines, this may well hold, but it does not hold when comparing unrelated functional groups. For example, thiols are much more acidic than alcohols, because they can stabilize the charged anionic state formed on deprotonation, but they are much worse hydrogen-bond donors, because they are less polar.
Intramolecular hydrogen bonding can be important for stabilizing a particular ligand conformation. Stahl et al (J. Med. Chem. 2010, 53, 2601–2611, DOI: 10.1021/jm100087s) have completed an extensive search of the CSD and shown that the most frequent intramolecular hydrogen bonds form planar, six-membered rings stabilized by conjugation with a π-system. A variety of further, far less explored topologies have been identified: weaker six-membered ring hydrogen bonds containing one sp3 center and, in particular, a number of nonplanar seven-membered and eight-membered ring topologies. Five-membered ring intramolecular hydrogen bonds have the smallest angles and the longest H-bond distances. With N-H as the proton donor, C=O or heterocyclic N appear to be the preferred acceptors.
An example is shown below. 1FN
Whilst much of the crystallographic work has concentrated on the influence that hydrogen bonding has on the structure of proteins, a recent paper by Selwood et al has presented an analysis of the chemical fragments that hydrogen bond to Asp, Glu, Arg and His side chains in proteins.
As one might expect, the fragments found to form two hydrogen bonds to the carboxylic acid of Asp and Glu include arylamidines, guanidines and 2-amino azaheterocycles such as 2-aminopyridine. Interestingly, the non-basic 1,2-cyclic diols are also common double hydrogen bond forming fragments. A wide range of fragments can form a single hydrogen bond with Asp or Glu, including amines, heterocycles, sulphonamides, hydroxyls and carboxylic acids.
Arg has a pKa of 12 and is positively charged at physiological pH; therefore, Arg acts entirely as a hydrogen bond donor in protein-ligand interactions. The most common functional groups to interact with Arg are phosphates, phosphonates and carboxylic acids. The vast majority of the other ligands bond via an O-mediated hydrogen bond; these include sulphonyls, ketones and phenols.
In contrast to the other side chains, His, with a pKa of 6, can interact in either protonated or unprotonated form. It can also coordinate with many metals. Whilst acidic groups again form a proportion of the observed ligands, azaheterocycles, sulphonamides, hydroxyls and carbonyls were also observed. Full details of the fragments are available in the supporting information.
An interesting paper, "Protein-ligand interfaces are polarized: Discovery of a strong trend for intermolecular hydrogen bonds to favour donors on the protein side with implications for predicting and designing ligand complexes" doi, has shown that hydrogen bonds across the interface tend to have the donor on the protein side. As shown in the plot below, ligands usually act as the H-bond acceptor.
Adapted from Raschka et al. 2018; source: https://www.biorxiv.org/content/early/2018/02/05/260612
Interactions between aromatic rings are well documented, and 60% of the aromatic residues in proteins are involved in aryl-aryl interactions. The T-shaped edge-to-face and the parallel-displaced stacking arrangements predominate. Whilst in proteins the parallel-displaced stacking arrangement is observed more often, the two arrangements are of similar energy (-1.6 to -2.4 kcal/mol), and the interaction is thought to be a combination of VDW dispersion and electrostatics. (A word of caution: the precise nature of the interaction is poorly understood and may not be modeled accurately by molecular modeling packages.) The strength of the interaction can be influenced by substitution: stacking arrangements of an electron-poor and an electron-rich aromatic ring profit from charge transfer, and stacking between electron-deficient rings is generally preferred over stacking of electron-rich ones. In our work on NK1 antagonists, the X-ray structures (shown below) illustrate how the differences in the pi-stacking interactions depend on the nature of the aromatic rings.
The influence of the edge-to-face pi-stacking interaction can be seen in the NMR spectra by comparing the parent benzyl alcohol with the benzyl ether (below). In the benzyl ether, the two ortho protons (red) are shifted upfield due to the shielding effect of one of the aromatic rings of the benzhydryl group.
In the X-ray cocrystal structure of human sEH (PDB: 3I1Y), one phenyl ring of the ligand (green) is positioned to allow a face-to-face π-stacking interaction with His524, whilst the other phenyl ring π-stacks edge-to-face with Tyr383. In addition, the pyridyl ring lies over Trp336.
Whilst there are a number of amino acids bearing aromatic side-chains, Phe and Tyr predominate when it comes to π-stacking interactions.
In addition to the interaction between two aryl systems, there is also a favourable interaction between aryl systems and the π-face of protein amide backbones. This type of interaction has been modelled for an extensive range of 5- and 6-membered heterocycles DOI. In general, the heterocycle is oriented such that the heterocycle dipole moment and amide dipole moment are close to antiparallel.
Dougherty (JACS, 2000, 122, 870) has undertaken a high-level theoretical comparison of cation-aryl interactions with salt bridges in a range of solvents, and found that the cation-aryl interaction remains strong across a range of solvents including polar media, which is not the case for salt bridges. Part of the reason for the difference is that ionised groups must be desolvated in aqueous solution, whereas aromatic rings are already desolvated to a large extent. Many cation-aryl interactions occur on the surface of proteins and, as a general rule, Arg interacts more often than Lys, with Trp providing the aromatic partner more often than Phe or Tyr. Methyl groups interact with the face of aromatic rings when bound to an electronegative atom. A positively charged nitrogen is a particularly strongly electronegative substituent, and therefore the direct interaction of an alkylated ammonium group with an aromatic ring leads to a strongly attractive interaction at a distance of 3.4-4.0 Angstroms. It might be expected that in acetylcholinesterase (AChE) the binding of acetylcholine would be stabilised by an interaction between the quaternary ammonium and a carboxylate of Asp or Glu; in fact it is stabilised by an interaction with a tryptophan (below).
X-ray structure of acetylcholinesterase showing the quaternary ammonium moiety of acetylcholine (yellow) stabilised by a cation-aryl interaction with a tryptophan (purple).
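The 3.4-4.0 Angstrom window quoted above for cation-aryl contacts can be checked on a structure with a few lines of geometry. The sketch below assumes the distance is measured from the cationic atom to the ring centroid, which the text does not specify, and uses hypothetical coordinates for an idealised ring rather than atoms taken from the acetylcholinesterase structure. The function names are illustrative.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    count = len(points)
    return tuple(sum(p[i] for p in points) / count for i in range(3))


def cation_ring_distance(cation, ring_atoms):
    """Distance from a cationic atom to the centroid of an aromatic ring."""
    return math.dist(cation, centroid(ring_atoms))


def in_cation_aryl_range(distance, lower=3.4, upper=4.0):
    """True if the distance falls in the 3.4-4.0 Angstrom window from the text."""
    return lower <= distance <= upper


# Hypothetical coordinates: an idealised six-membered ring in the xy plane
# (C-C 1.39 Angstroms) with a cationic nitrogen 3.7 Angstroms above its centre.
ring = [(1.39 * math.cos(k * math.pi / 3), 1.39 * math.sin(k * math.pi / 3), 0.0)
        for k in range(6)]
nitrogen = (0.0, 0.0, 3.7)

d = cation_ring_distance(nitrogen, ring)
print(f"N to ring centroid: {d:.2f} Angstroms, within range: {in_cation_aryl_range(d)}")
```

For a real complex, the ring atoms and the cationic atom would be read from the PDB file; the distance test itself is unchanged.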
It is important to remember that whilst we draw the structure of acetylcholine with a formal positive charge on the nitrogen this is a misleading representation of the charge distribution.
The positive charge actually resides on the hydrogens of the methyl groups. This is perhaps best displayed in Torch, using the XED forcefield. As shown in the image below, the red volume represents positive electrostatic charge whilst the blue volume represents negative.
Cresset technology centers on the application of the XED force field to the design of new small molecule bioactive compounds. Unlike traditional molecular mechanics, the XED approach uses a complex description of atoms to model charge away from atomic centers.
In the case of protonated amines a similar interaction is possible, but in addition the hydrogen on the nitrogen often provides additional stabilisation via a hydrogen bonding interaction.
A publication describing macrocyclic ligands as potent cyclophilin inhibitors based on structural simplification of sanglifehrin A DOI highlights a novel binding mode in which the styrene moiety of Compound 7 engages in a π-stacking interaction with Arg55 of cyclophilin A (PDB file 5TA4).
The same interaction is displayed below using 3Dmol.js; the ligand is shown in green and Arg55 is coloured blue.
Aryl O-H or N-H Aryl interactions
Worth reading:- Interactions with Aromatic Rings in Chemical and Biological Recognition, Angew. Chem. Int. Ed. 2003, 42, No. 11
Whilst less well described than cation-aryl interactions, there are a number of examples of anion-aryl interactions. The review "Anion-π interactions" DOI highlights examples and theoretical considerations taken from supramolecular chemistry, suggesting the binding interactions are comparable in energy to hydrogen bonds.
In sharp contrast to the mature area of cation binding to aromatic systems, anion-π interactions had hitherto been overlooked, primarily due to their counterintuitive nature (anions are expected to exhibit repulsive interactions with aromatic π-systems due to their electron donating character)
An interesting example of ligand binding to a biological macromolecule is found in the binding of phenyl diketo acids to malate synthase PDB. The diketo acid binds to the active site magnesium and Arg339, whilst the carboxylate of the Asp633 residue packs face-on to the π-cloud of the aromatic ring. The mean contact distance is ~3.5 Å, less than the typical ~4.5 Å distance of hydrophobic contacts, suggesting an anion-π interaction. Modelling this interaction has been used to screen ligands DOI.
Bonding Sulphur and Oxygen
Nonbonded interactions between a sulphur atom and a carbonyl oxygen atom have been observed in proteins and small molecules. In the GABA ligand below (J. Med. Chem. 2002, 45, 1887-1900), whilst the C-5 and C-6 aromatic rings (interplanar torsions ca. 58 and 63°, respectively) were twisted out of plane with the central pyridone, the thiazole was clearly coplanar with the pyridone ring (interplanar torsion ca. 3°), with a close intramolecular contact between the sulphur and the carbonyl oxygen (S...O = 2.76 Angstroms; van der Waals radii S 1.80 Angstroms, O 1.52 Angstroms).
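The significance of the 2.76 Angstrom contact is easier to see when it is set against the sum of the van der Waals radii quoted above. A minimal sketch of that arithmetic follows; the function name is illustrative, and the radii are simply the two values given in the text.

```python
VDW_RADII = {"S": 1.80, "O": 1.52}  # Angstroms, values quoted in the text


def vdw_overlap(element_a, element_b, observed_distance):
    """How far a contact sits inside the sum of the van der Waals radii."""
    return VDW_RADII[element_a] + VDW_RADII[element_b] - observed_distance


radii_sum = VDW_RADII["S"] + VDW_RADII["O"]
overlap = vdw_overlap("S", "O", 2.76)
print(f"Sum of radii: {radii_sum:.2f} Angstroms")
print(f"Observed S...O contact is {overlap:.2f} Angstroms inside that sum")
```

A positive result means the two atoms sit closer than a simple hard-sphere contact would allow, which is what marks the S...O contact as an attractive interaction rather than a packing accident.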
In these cases it is worth noting that replacing the S by either NH or O, which might be thought of as bioisosteric replacements, often fails to maintain biological activity as shown in the example taken from Abl Kinases in the table below.
In protein crystal structures, the presence of methionine or cysteine sulphurs sitting over aromatic rings has been noted on a number of occasions.
Bonding to Halogens
Halogens are present in around 25% of drugs (calculated using data from the Guide to Pharmacology Database) and are often used as bioisosteric replacements for H, methyl, OH and NH2.
Bonds to halogens are significantly weaker than hydrogen bonds, but there are many examples in the PDB of carbonyls interacting with halogens, with bonds to I and Br predominating. It is perhaps worth noting that halogens have been introduced into ligands to aid X-ray analysis, but they may also influence binding. Based on the observed bond angles, the interaction is between the halogen and the pi-cloud of the carbonyl rather than the lone pair, with a clear clustering of X--O=C-N dihedral angles around 90°, associated with interactions that involve primarily the pi-system of the carbonyl. Of the 1500 compounds in the MDDR, around 15% contain Cl, whilst Br and I account for a further 3%. Although these interactions may be thought of as weaker than hydrogen bonds, you should bear in mind the impact that desolvation of the hydrogen bonding partners may have on the overall energy change. Desolvation of the halogen may be less of an issue, and so the overall benefit may be higher.
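The X--O=C-N dihedral mentioned above is straightforward to compute from four sets of atomic coordinates. The following is a minimal sketch using the standard four-point dihedral construction; the coordinates are hypothetical and chosen so that the halogen sits directly above the amide plane, and the sign convention of the result is not significant here, so only the magnitude is printed.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals to the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    b2_unit = (b2[0] / b2_len, b2[1] / b2_len, b2[2] / b2_len)
    m1 = _cross(n1, b2_unit)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


# Hypothetical amide fragment lying in the xy plane.
oxygen = (0.0, 0.0, 0.0)
carbon = (1.23, 0.0, 0.0)
nitrogen = (1.90, 1.15, 0.0)

halogen_above_plane = (0.0, 0.0, 3.0)  # approaches over the pi-system
halogen_in_plane = (-1.5, -2.6, 0.0)   # approaches in the amide plane

for name, halogen in [("above plane", halogen_above_plane), ("in plane", halogen_in_plane)]:
    angle = abs(dihedral(halogen, oxygen, carbon, nitrogen))
    print(f"{name}: |X--O=C-N| = {angle:.1f} degrees")
```

A halogen sitting over the face of the carbonyl gives a value near 90°, the clustering described above, whereas an approach in the amide plane gives a value near 0° or 180°.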
Statistical evaluation of crystallographic data revealed that electrophiles preferentially form contacts with halogen (Cl, Br, I) moieties in a “side-on” fashion, while nucleophiles approach the halogens “head-on”. In general, these sorts of interactions are not modelled well by the atom-centred charges used by most forcefields; an exception is the extended forcefield XED developed by Cresset. Looking at the electrostatic fields for the halobenzenes (blue negative, red positive), we can see the electropositive surface extending from the Br and I atoms.
As predicted, substitution of hydrogen by iodine can lead to the largest affinity gain, since the strength of the halogen bond increases with the size of the halogen atom. As shown below, introduction of an iodine into tubercidin affords a greater than 200-fold increase in affinity DOI.
Targeted replacement of one or more hydrogen atoms in a molecule with fluorine atoms is part of the medicinal chemist's toolkit to improve affinity at a given target and/or to modulate parameters such as metabolic stability and pKa. This bioisosteric replacement has been extensively evaluated, and it is perhaps not surprising that around 13% of drugs contain a fluorine atom (data from the Guide to Pharmacology Database). Fluorine can interact with both polar and hydrophobic groups in proteins. These interactions can be further classified into polar interactions with hydrogen bond donors (e.g., backbone NH, polarized Cα–H, polar side chains, and protein-bound water), hydrophobic interactions with lipophilic side chains, and orthogonal multipolar interactions with backbone carbonyl groups, amide-containing side chains (Asn and Gln), guanidinium groups (Arg) and sulphur (Cys).
An interesting example is "Harnessing Fluorine–Sulfur Contacts and Multipolar Interactions for the Design of p53 Mutant Y220C Rescue Drugs" DOI.
In this study, we aimed at improving the potency of the carbazole-based compound Phikan083 and employed ab initio quantum-chemical calculations to probe potential interaction energy gains upon fluorination of the ethyl anchor.
They investigated the successive fluorination of the pendant ethyl group. In the case of the trifluoromethyl analogue, shown below (PDB = 54GO), the fluorines can be seen interacting with Cys220 and with two backbone carbonyls, with O=C···F angles of 97.5° and 80.1°.
A computational algorithm named FMAP, which calculates fluorophilic sites in proximity to the protein backbone, has been developed DOI.
On the basis of the analysis of protein–ligand complexes from the Protein Data Bank (PDB), we developed an algorithm (FMAP) for mapping sites for fluorine atoms on protein structures to form favourable C–F···C═O interactions with the protein backbone. The geometric criteria used in FMAP have been selected to encompass ∼80% of fluorine sites found in the experimental structures in PDB. Fluorine sites are mapped onto a protein structure through a Pymol extension and are represented as a surface spanning 2.8–3.2 Å range from the peptide bond (Figure 2a). FMAP also eliminates unlikely fluorine positions through filters based on unfavourable geometry for multipolar interactions as well as steric clashes with protein atoms (see Methods for a detailed description of FMAP).
FMAP was used to evaluate the contributions of fluorine atoms to the binding affinity of MI-2-3 within a series of analogues replacing CF3 with CH3, CH2F and CHF2 groups; the results are shown in the image below, taken from the paper. The paper also underlines another important point: "while only single fluorine may interact with backbone, other fluorines might be needed to stabilise the appropriate rotameric state". The script is freely available from the authors.
Some Examples from PDB
The image below, taken from the Protein Data Bank entry 3EQB and visualised in MOE, shows the ligand forming an I···O contact with the carbonyl oxygen of Val 127 in MEK1, showing the “head-on” interaction with a bond angle of 174°.
Similarly 3GT3 shows the ligand forming a Br···O contact with the hydroxyl group of Ser 247 in proteinase K.
The crystal structure 3KXH shows the interactions between Br and the carbonyls of Glu114 and Val116, and the Br···π contact with the phenyl ring of Phe 113 in CK2 kinase.
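The “head-on” geometry quoted above for 3EQB (174°) is the C–X···O angle measured at the halogen. A minimal sketch of that measurement is given below. The coordinates are hypothetical, constructed to give a 174° contact and a 90° “side-on” contact, and are not taken from any of the PDB entries mentioned; the C–I bond length and I···O distance are likewise assumed values.

```python
import math


def angle(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against rounding just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Hypothetical C-I bond along the x axis (2.10 Angstroms, assumed).
carbon = (0.0, 0.0, 0.0)
iodine = (2.10, 0.0, 0.0)

# Oxygen placed 3.0 Angstroms from iodine, 6 degrees off the C-I axis.
tilt = math.radians(6.0)
oxygen_head_on = (2.10 + 3.0 * math.cos(tilt), 3.0 * math.sin(tilt), 0.0)

# A second partner placed perpendicular to the C-I axis.
partner_side_on = (2.10, 3.0, 0.0)

print(f"head-on C-I...O angle: {angle(carbon, iodine, oxygen_head_on):.1f} degrees")
print(f"side-on C-I...E angle: {angle(carbon, iodine, partner_side_on):.1f} degrees")
```

Angles close to 180° correspond to a nucleophile approaching along the extension of the C–X bond, while angles near 90° correspond to the side-on approach described earlier for electrophiles.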
A detailed study, "Thermodynamic and Structural Characterization of Halogen Bonding in Protein−Ligand Interactions: A Case Study of PDE5 and Its Inhibitors" DOI, compared the corresponding H, F, Cl, Br and I analogues binding to the catalytic domain of PDE5. Comparison of the activities shows the iodo analogue to be the most potent and the fluoro analogue the least active, actually worse than hydrogen. This is consistent with the electrostatic surfaces described above.
An overlay of the X-ray structures is shown below (I = orange, Cl = green, F = yellow). The major interactions of the ligands with the protein are the classical bidentate H-bonds with Q817, π−π stacking interactions with the phenyl ring of F820, and hydrophobic interactions with residues L765, V782, A783, and F786. The iodo substituent interacts both with Tyr612 and with a tightly bound water molecule.
Worth reading Principles and Applications of Halogen Bonding in Medicinal Chemistry and Chemical Biology J Med Chem DOI
The hydrophobic interaction is defined by IUPAC as:
The tendency of hydrocarbons (or of lipophilic hydrocarbon-like groups in solutes) to form intermolecular aggregates in an aqueous medium, and analogous intramolecular interactions
From the analysis of the PDB ligands above, hydrophobic contacts are by far the most common interactions in protein–ligand complexes. The most common hydrophobic interaction is the one formed between an aliphatic carbon in the receptor and an aromatic carbon in the ligand. Leucine, followed by valine, isoleucine and alanine, is the side chain most frequently engaged in hydrophobic interactions.
Singh et al (American Journal of Immunology 4 (3): 33-42, 2008) have completed an extensive study of the interactions that are important for the binding of tetrahydroimidazobenzodiazepinone (TIBO), which belongs to the non-nucleoside group of reverse transcriptase inhibitors (NNRTIs). The study showed that hydrophobic interaction is predominant and makes the major contribution, while hydrogen bonding and polar interactions help in the proper orientation of the compound.
The "hydrophobic effect" (a term coined by Charles Tanford) refers to the idea that, energetically, protein folding is driven by two factors: hydrophobic side-chains prefer to "get away" from water, whilst hydrophilic side-chains prefer to interact with the water. When this is extended to the interaction of ligands with the binding site, the classic concept of the hydrophobic effect is as follows. A hydrophobic ligand (or surface on the binding site) disrupts the structure of bulk water and decreases entropy because of stronger bonding and ordering of water molecules around the solute. If ligand and binding site associate, some of these water molecules can be returned to bulk water. Thus the hydrophobic effect is almost entirely entropic and has been correlated with the partition between aqueous and non-polar solvents.
Example ligand binding interactions
A recent publication from Mileni et al (DOI: 10.1021/jm9012196) rather nicely illustrates a whole range of molecular interactions, and I’ve taken the PDB file (3K83) into MOE and highlighted some of the key interactions below. The catalytic mechanism of fatty acid amide hydrolase involves nucleophilic attack of Ser241 onto the amide carbonyl of the substrate. In the case of the inhibitor below (Ki 22 nM), Ser241 (yellow) forms a hemiacetal with the carbonyl adjacent to the oxazole. This tetrahedral intermediate is stabilised by the electron-withdrawing heterocycle and by hydrogen bonds to backbone N-Hs.
Interestingly, the carboxylic acid of the ligand does not interact with a basic amine but is instead stabilised by interaction with three water molecules and hydrogen bonds to two backbone N-Hs. The remaining key interactions are a series of aryl interactions. Phe192 forms an edge-to-face pi-interaction with the pyridyl ring of the inhibitor and with one of the rings of the biphenyl group. Phe381 makes a key aryl CH-pi contact with the inhibitor biphenyl ring, whilst Met236 and Met495 orient their sulfur lone pair electrons toward the bound phenyl hydrogens, and the hydroxyl of Thr377 also interacts with the phenyl hydrogens.
X-ray Crystallographic Analysis of alpha-Ketoheterocycle Inhibitors Bound to a Humanized Variant of Fatty Acid Amide Hydrolase
A Medicinal Chemist’s Guide to Molecular Interactions. Bissantz, Kuhn, Stahl, J Med Chem DOI
Quantifying Intermolecular Interactions: Guidelines for the Molecular Recognition Toolbox. Angew. Chem. Int. Ed. 2004, 43, 5310–5324, DOI
Principles and Applications of Halogen Bonding in Medicinal Chemistry and Chemical Biology J Med Chem DOI
A systematic analysis of atomic protein–ligand interactions in the PDB DOI
Last update 12 September 2018