Whilst much of drug discovery deals with non-covalent, reversible interactions with the target protein, there is also a class of therapeutic agents that bind covalently to the target protein; these are distinct from ligands that have very slow off-rates. Depending on the type of covalent interaction, the binding can range from readily reversible to effectively irreversible, where restoration of activity requires new protein synthesis. Covalent binding has many attractions: long duration of action, insensitivity to build-up of the natural substrate, potential for much lower prolonged systemic exposure, and excellent selectivity if targeting a non-conserved residue. However, covalent binders do raise concerns over potential toxicity caused by non-specific reaction with proteins, DNA, etc. Idiosyncratic toxicity has long been studied and, in most instances, the liver injury is initiated by the bioactivation of drugs to chemically reactive metabolites, which can react covalently with cellular macromolecules such as proteins, lipids, and nucleic acids, leading to protein dysfunction, lipid peroxidation, DNA damage, and oxidative stress.
There is increased interest in covalent therapeutics in the literature, as shown in Figure 1, using data from PubChem searches.
It is now apparent that an increasing number of marketed drugs that target enzymes act as covalent inhibitors, many of which were not originally designed as such. A 2005 publication, Mechanistic Basis of Enzyme-Targeted Drugs DOI, examined the mechanisms of marketed enzyme inhibitors.
From an analysis of the FDA Orange Book, there are 317 marketed drugs that work by inhibiting an enzyme. These drugs inhibit 71 enzymes, including 48 human, 13 bacterial, five viral, four fungal, and one protozoal enzyme. Among the 317 drugs, 65% either undergo reactive chemistry in the active site of the target enzyme or contain a structural motif related to the substrate. Among the 71 enzyme targets, 25 are irreversibly inhibited by drugs, and 19 of the 25 irreversibly inhibited enzymes are covalently modified by the drug.
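The counts quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch; the figures are taken directly from the quoted analysis, while the variable names are purely illustrative.

```python
# Figures quoted from the Orange Book analysis above.
enzyme_targets_by_origin = {
    "human": 48,
    "bacterial": 13,
    "viral": 5,
    "fungal": 4,
    "protozoal": 1,
}
total_drugs = 317
irreversibly_inhibited = 25   # enzyme targets inhibited irreversibly
covalently_modified = 19      # of those, targets covalently modified by the drug

# The per-organism counts should add up to the stated number of targets.
total_targets = sum(enzyme_targets_by_origin.values())

# 65% of the drugs, rounded to whole drugs.
mechanism_based_drugs = round(0.65 * total_drugs)

# Share of irreversibly inhibited targets that are covalently modified.
covalent_share_of_irreversible = covalently_modified / irreversibly_inhibited

print(f"Enzyme targets: {total_targets}")
print(f"Drugs with reactive chemistry or substrate-like motif: ~{mechanism_based_drugs}")
print(f"Covalent share of irreversible targets: {covalent_share_of_irreversible:.0%}")
```

So roughly three quarters of the irreversibly inhibited enzyme targets in that analysis involve covalent modification by the drug.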
A number of enzyme classes contain nucleophilic residues within the active site, including serine proteases and cysteine proteases, which also have a histidine as part of the catalytic triad. Other enzymes contain histidine or lysine in the active site, so it is perhaps not surprising that most of the covalent ligands contain an electrophile. A selection of approved covalent inhibitors together with their biological targets is shown below.
The beta-lactam antibiotics (penicillins and cephalosporins) are a very well studied class of therapeutic agent whose mechanism of action is the inhibition of cell wall synthesis. Penicillin inhibits the formation of peptidoglycan cross-links in the bacterial cell wall; this is achieved through reaction of the β-lactam ring of penicillin with the enzyme DD-transpeptidase. As a consequence, DD-transpeptidase cannot catalyze formation of the cell wall cross-links.
One of the mechanisms of beta-lactam antibiotic resistance is through penicillinase, a specific type of β-lactamase that shows specificity for penicillins and hydrolyses the β-lactam ring. This led to another class of compounds designed to inhibit penicillinase. Like beta-lactam antibiotics, they are processed by beta-lactamases to form an initial covalent intermediate; however, the oxygen acts as a leaving group, revealing an electrophilic imine that can react with the beta-lactamase. This class of compounds has little intrinsic antibiotic activity but enhances the activity of other beta-lactam antibiotics.
This is an example where activation is required to reveal the electrophilic species.
Clopidogrel is used to reduce the risk of heart disease and stroke in those at high risk. It acts by inhibiting the P2Y12 receptor on platelet cell membranes. It is a prodrug that requires CYP2C19 oxidation for its activation: oxidation of the thiophene ring, followed by tautomerisation and ring opening, reveals a free thiol group which forms a disulphide bond with a cysteine on the P2Y12 receptor.
Covalent Inhibitor Drug Discovery
Covalent therapeutics are particularly attractive in some therapeutic areas, for example cancer and anti-infectives, where prolonged inhibition of the molecular target is essential without high systemic exposure, or in cases where the natural ligand has extremely high affinity or is present in high concentrations (e.g. kinases). In addition, covalent inhibitors may offer advantages when it comes to drug resistance: whilst mutations may slow the binding of the inhibitor, provided the nucleophilic residue is still present, irreversible inhibition will still occur eventually. With a covalent inhibitor the duration of action is largely governed by resynthesis of the target protein, so if a protein is rapidly turned over then covalent inhibition may have little advantage.
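The dependence of duration of action on protein turnover can be illustrated with a very simple model. The sketch below assumes (this is an assumption, not something stated above) that the target is fully and irreversibly inhibited at time zero, that no free drug remains afterwards, and that the protein pool is replaced with first-order kinetics at steady state. The function name and the half-life values are hypothetical.

```python
import math


def active_fraction(hours_since_dose: float, turnover_half_life_h: float) -> float:
    """Fraction of target activity restored by newly synthesised protein.

    Assumes complete inhibition at t = 0, no remaining free drug, and
    first-order turnover of the protein pool at steady state.
    """
    k_turnover = math.log(2) / turnover_half_life_h
    return 1.0 - math.exp(-k_turnover * hours_since_dose)


# Hypothetical turnover half-lives (hours) for three imaginary targets.
for half_life in (2.0, 24.0, 72.0):
    recovered = active_fraction(24.0, half_life)
    print(f"half-life {half_life:>4.0f} h: {recovered:.0%} activity back after 24 h")
```

Under these assumptions a target with a 2 h turnover half-life has almost fully recovered a day after dosing, whereas a slowly turned-over target remains largely inhibited, which is the point made in the paragraph above.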
A number of enzyme classes contain nucleophilic residues within the active site, including cysteine, serine, histidine or lysine, so it is perhaps not surprising that most covalent ligands contain an electrophile. If the target protein contains a potential nucleophile in or near the active site, it may be suitable for covalent inhibition. More recently, efforts have been undertaken to screen the entire proteome to identify targets that might be susceptible to covalent inhibition DOI.
Here we report a quantitative analysis of cysteine-reactive small-molecule fragments screened against thousands of proteins in human proteomes and cells. Covalent ligands were identified for >700 cysteines found in both druggable proteins and proteins deficient in chemical probes, including transcription factors, adaptor/scaffolding proteins, and uncharacterized proteins.
Screening for covalent ligands
High-throughput screening is a popular approach for hit-finding, but if the objective is to find covalent ligands then it is likely that any molecules containing potential reactive groups will have been removed from the screening collection. Thus the most common strategy is to incorporate an electrophile into an already optimized reversible ligand. Screening for covalent ligands comes with a number of inherent risks, and it is essential to have a clear strategy to distinguish between genuine covalent inhibitors and false positives that might be interfering with the assay. It is also important to check for off-target covalent binding early in the screening process.
In an effort to identify potentially promiscuous electrophiles, a recent publication DOI screened approximately 1000 electrophiles (focusing on two mild electrophilic 'warheads': acrylamides and chloroacetamides) against ten different proteins using mass spectrometry detection. The authors also screened for high reactivity and showed that, whilst chloroacetamides were more reactive, there was only about a 30-fold difference in reactivity across most of the library. An additional advantage of covalently binding fragments is that they may ease co-crystal structure determination compared with reversible fragments that have a low residence time in the binding site.
To address a major concern in covalent-molecule screening, namely that high reactivity would lead to a high proportion of irrelevant hits, we developed a high-throughput thiol-reactivity assay in order to assess the reactivity of all fragments in our library. This entails incubating fragments with reduced DTNB (Ellman's Reagent; 5,5-dithio-bis-2-nitrobenzoic acid) and following the absorbance of TNB²⁻ (at 412 nm wavelength) for up to seven hours.
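An absorbance time course of this kind can be reduced to a single reactivity number. The sketch below is one generic way to do that and is not necessarily the fitting procedure used in the quoted study: it assumes the electrophile is in large excess so that the 412 nm absorbance decays with pseudo-first-order kinetics, and it fits a straight line to ln(absorbance) against time. The function name and the synthetic trace are hypothetical.

```python
import math


def pseudo_first_order_rate(times_h, absorbances):
    """Return k_obs (1/h) from the least-squares slope of ln(A412) versus time.

    Assumes a single exponential decay of the thiol signal (electrophile in excess).
    """
    logs = [math.log(a) for a in absorbances]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return -sxy / sxx  # decay, so the slope is negative


# Synthetic, noise-free trace over seven hours (hypothetical values).
true_k = 0.25
times = [float(hour) for hour in range(8)]
trace = [1.2 * math.exp(-true_k * t) for t in times]

k_obs = pseudo_first_order_rate(times, trace)
half_life_h = math.log(2) / k_obs
print(f"k_obs = {k_obs:.3f} per hour, thiol half-life = {half_life_h:.2f} h")
```

Ranking fragments by a rate constant like this makes it straightforward to flag the most reactive members of a library before interpreting screening hits.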
The electrophiles were tested against a diverse range of Cys-containing proteins. Only 27 of the electrophilic fragments are promiscuous (label 2 or more proteins by 50%, or 3 or more proteins by more than 30%), and promiscuity does not appear to correlate with reactivity.
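The promiscuity rule is easy to express as a small filter. The sketch below is an illustration only: the function name and the labelling values are hypothetical, and "by 50%" is interpreted here as "at least 50%", which is an assumption about the exact cut-off.

```python
def is_promiscuous(labeling_percent: dict[str, float]) -> bool:
    """Apply the rule: 2 or more proteins labelled by 50%, or 3 or more by more than 30%.

    labeling_percent maps protein name -> percent labelling seen by mass spectrometry.
    """
    values = list(labeling_percent.values())
    strongly_labelled = sum(1 for v in values if v >= 50)    # assumed to mean >= 50%
    moderately_labelled = sum(1 for v in values if v > 30)
    return strongly_labelled >= 2 or moderately_labelled >= 3


# Hypothetical labelling results for three fragments.
fragments = {
    "fragment_A": {"OTUB2": 80, "NUDT7": 55, "NNMT": 5},    # two proteins at 50% or more
    "fragment_B": {"OTUB2": 35, "NUDT7": 40, "NNMT": 31},   # three proteins above 30%
    "fragment_C": {"OTUB2": 100, "NUDT7": 30, "NNMT": 10},  # selective for one protein
}
for name, labelling in fragments.items():
    print(name, "promiscuous" if is_promiscuous(labelling) else "selective")
```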
The hit rates obtained with this library are on-par or slightly higher than observed in screens with non-covalent fragments 65,66: 2-4% for NNMT, OTUB2 and NUDT7; and 0.2-0.9% for other proteins. These would be attenuated by screening at different concentrations: here, all primary screening was performed at 200 μM, but based on the results, we can now recommend a concentration of 100 μM when targeting catalytic cysteines, while staying at 200 μM for less nucleophilic target cysteines.
The 27 promiscuous compounds (label 2 or more proteins by 50%, or 3 or more proteins by more than 30%) are shown below.
The Cysteine Covalent library is available on the XChem website; I downloaded the SDF file and generated the profile below. In general this is a typical fragment library, with MWt below 300 and LogP below 3. However, this library is, perhaps unsurprisingly, lacking in ionisable groups.
This library was used to screen for hits against OTUB2, a deubiquitinase (47 hits), and NUDT7, a peroxisomal CoA pyrophosphohydrolase (36 hits).
Among the NUDT7 hits, 5 sharing a common 2-phenylpyrrolidine motif stood out, with four compounds labeling 100%.
These were merged with a non-covalent hit obtained from a previous fragment screen DOI shown below.
A library of fragments linked to an α,β-unsaturated methyl ester electrophile has also been used to identify inhibitors of the RBR E3 ubiquitin ligase HOIP DOI. This library was based on >8000 carboxylic acids from the GSK compound collection, which were initially filtered on a few calculated physicochemical properties derived from their SMILES strings (e.g. molecular weight, clogP, hydrogen bond acceptors and donors, and aromatic rings) and on predicted toxicophores. These were then clustered, and representative examples from each cluster were selected and used to synthesise 106 electrophilic fragments. DMSO stock solutions of the synthesised compounds were monitored over time, and the vast majority were found to be stable for >6 months in DMSO when stored at 4 °C. The library was pooled into 22 groups of 4–5 compounds (each compound separated in molecular weight by at least 5 mass units) and screened using mass spec detection. The profile of the library is shown below; again the fragments contain no ionisable functional groups.
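The pooling step can be sketched in a few lines. How the original pools were actually assembled is not described above, so the approach here is an assumption: compounds are sorted by mass and dealt round-robin into the pools, which spreads similar masses across different pools, and the 5 mass unit separation is then verified rather than guaranteed. The masses are made-up placeholder values, and the function names are hypothetical.

```python
def pool_by_mass(masses, n_pools):
    """Deal compound indices, sorted by mass, round-robin into n_pools pools."""
    order = sorted(range(len(masses)), key=lambda i: masses[i])
    pools = [[] for _ in range(n_pools)]
    for rank, idx in enumerate(order):
        pools[rank % n_pools].append(idx)
    return pools


def min_mass_gap(pool, masses):
    """Smallest mass difference between any two compounds in one pool."""
    pool_masses = sorted(masses[i] for i in pool)
    return min(b - a for a, b in zip(pool_masses, pool_masses[1:]))


# 106 placeholder masses between about 150 and 300 Da (not real library data).
masses = [150.0 + 1.4 * i + 0.5 * ((7 * i) % 3) for i in range(106)]

pools = pool_by_mass(masses, n_pools=22)
for number, pool in enumerate(pools, start=1):
    gap = min_mass_gap(pool, masses)
    status = "ok" if gap >= 5.0 else "re-pool"
    print(f"pool {number:2d}: {len(pool)} compounds, smallest gap {gap:.1f} Da ({status})")
```

With 106 compounds and 22 pools, round-robin dealing gives pools of 4 or 5 compounds. With real masses, any pool flagged "re-pool" would need members swapped by hand or with a more careful assignment scheme.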
The hit below was identified showing 85% inhibition after 24h incubation (10-fold excess over protein).
Another way to evaluate a covalent inhibitor: if the putative target is a cysteine residue in the active site, it may be possible to mutate the Cys to Ser and check that covalent binding is reduced.
Virtual screening for covalent hits has also been undertaken DOI, identifying submicromolar to low-nanomolar hits with high ligand efficiency, cellular activity and selectivity, including covalent inhibitors of JAK3.
Design of covalent inhibitors
If we assume that you don't want to rely on general metabolic activation in the liver, then the approach undertaken is to identify a specific substrate and attach a reactive warhead. There are a variety of reactive functional groups; the most reactive will certainly react with the target protein but may also react with a variety of off-target proteins.
A wide range of reactive functional groups have been used, some of which favour specific amino-acid side chains. Some are slowly reversible, which may help with off-target toxicity.
The anti-cancer agent Carfilzomib irreversibly binds to and inhibits the chymotrypsin-like activity of the 20S proteasome, an enzyme that degrades unwanted cellular proteins. Inhibition of proteasome-mediated proteolysis results in a build-up of polyubiquitinated proteins, which causes cell cycle arrest, apoptosis, and inhibition of tumor growth.
E-64 is an epoxide which can irreversibly inhibit a wide range of cysteine peptidases. The compound was first isolated and identified from Aspergillus japonicus in 1978. It has since been shown to inhibit many cysteine peptidases such as papain, cathepsin B, cathepsin L, calpain and staphopain and is used as a biochemical tool.
Peptidyl chloromethyl ketones were among the first affinity labels developed for serine proteases; however, owing to the inherent chemical reactivity of the chloroketone functional group, their major disadvantage is a lack of selectivity. Thienylhalomethylketones (HMK-32) have been reported as irreversible inhibitors of GSK-3β DOI, reacting with Cys199 in the active site. This observation was used to convert the reversible inhibitor (1) into an irreversible inhibitor (2) DOI.
The use of unsaturated carbonyls as Michael acceptors is particularly popular because, by changing the double-bond substitution, it is possible to fine-tune the reactivity. There is an excellent review of the thiol reactivity of Michael acceptors possessing an α,β-unsaturated carbonyl group DOI.
Telaprevir (purple in image below) is a potent HCV protease inhibitor that forms a stabilised (but reversible) hemiacetal with the active site Ser139 (shown in green below) [PDB 3SV6]. In contrast, the sulphonamide of compound (3) (in orange below) forms a hydrogen-bonding network with Ser139, but the acrylamide reacts with Cys159 (yellow) [PDB 3OYP]. Mutation of Cys159 to Ser results in a dramatic reduction in potency.
Gefitinib and Erlotinib are first-generation EGFR inhibitors. EGFR is overexpressed in the cells of certain types of human carcinoma, for example in lung and breast cancers, and both agents have been used to treat lung cancer. As with other ATP-competitive small-molecule kinase inhibitors, patients rapidly develop resistance; in the case of erlotinib this typically occurs 8–12 months from the start of treatment. Over 50% of resistance is caused by a mutation in the ATP binding pocket of the EGFR kinase domain involving substitution of a small polar threonine residue with a large nonpolar methionine residue (T790M). The covalent inhibitor Afatinib targets Cys-797, which is positioned in the solvent channel of the kinase. Afatinib is active not only against EGFR mutations targeted by first-generation tyrosine-kinase inhibitors (TKIs) like erlotinib or gefitinib, but also against mutations such as T790M which are not sensitive to these standard therapies. Similar acrylamide functionality is used in the EGFR inhibitors rociletinib, dacomitinib and neratinib.
Whilst the basic dimethylamino in Afatinib is important in improving solubility it has been suggested that this functionality may also enhance reactivity with glutathione DOI via an intramolecular protonation mechanism shown below. This can be avoided by either ring constraints or by moving the position of the basic amine as in Osimertinib.
Protein kinases are an important class of oncology targets; however, selectivity among the more than 500 kinases encoded in the human genome is a concern. Since all use the same cofactor (ATP), the binding pocket is structurally similar across kinases, so selectively targeting a single kinase can be challenging. Designing compounds to covalently bind to poorly conserved residues, such as a non-catalytic cysteine adjacent to the active site, offers potential gains in selectivity for a target kinase over kinases not containing a cysteine in that position.
α-Cyanoacrylamide Warhead
Vinyl sulphone and Nitriles
Cathepsin K cleaves Type I collagen at multiple positions; the recognition sequence is Gly-Leu-Lys-Gly-His. Combining an optimised recognition sequence with a vinyl sulphone warhead gives an irreversible inhibitor DOI.
Lysine is more abundant than cysteine and is a harder nucleophile, so targeting lysine offers alternative routes to covalent inhibition. However, only the neutral form of lysine will act as a nucleophile (pKa 10.4). Whilst surface lysine residues are likely to be protonated, buried lysines in hydrophobic environments are less solvent accessible and may favour the neutral form. Model reactions using N-α-acetyl-lysine and glutathione have suggested that lysine is less reactive than cysteine towards soft electrophiles such as acrylamides, but more reactive towards vinyl sulphones or sulphonyl fluorides.
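The consequence of the quoted pKa can be made concrete with the Henderson-Hasselbalch relationship. The sketch below estimates the fraction of a lysine side chain that is neutral at a given pH; the pKa of 10.4 comes from the text above, whereas the physiological pH of 7.4 and the lowered pKa of 8.0 for a buried lysine are illustrative assumptions, not measured values.

```python
def neutral_amine_fraction(ph: float, pka: float) -> float:
    """Fraction of an amine in its neutral (nucleophilic) form.

    From Henderson-Hasselbalch: [B] / ([B] + [BH+]) = 1 / (1 + 10 ** (pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


surface_lysine = neutral_amine_fraction(ph=7.4, pka=10.4)
# Hypothetical buried lysine whose pKa is depressed by a hydrophobic environment.
buried_lysine = neutral_amine_fraction(ph=7.4, pka=8.0)

print(f"Surface lysine (pKa 10.4): {surface_lysine:.2%} neutral")
print(f"Buried lysine (assumed pKa 8.0): {buried_lysine:.0%} neutral")
```

At pH 7.4 only about one surface lysine side chain in a thousand is neutral at any moment, which is why a perturbed pKa in a buried site can make such a large difference to reactivity.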
Using a similar strategy, Odanacatib has an electrophilic nitrile that reacts covalently with a Cys in Cat K.
It is worth noting that Odanacatib has excellent selectivity over other Cathepsins DOI.
Cat K = 0.2 nM, Cat B = 1034 nM, Cat L = 2995 nM, Cat S = 60 nM.
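These potencies translate into fold-selectivity values by simple division. The sketch below uses only the numbers listed above; the dictionary and variable names are illustrative, and the type of measurement behind the values is not stated in the text.

```python
# Potency values (nM) for Odanacatib as listed above.
potency_nm = {"Cat K": 0.2, "Cat B": 1034, "Cat L": 2995, "Cat S": 60}

target = "Cat K"
fold_selectivity = {
    enzyme: value / potency_nm[target]
    for enzyme, value in potency_nm.items()
    if enzyme != target
}

for enzyme, fold in sorted(fold_selectivity.items(), key=lambda item: item[1]):
    print(f"{enzyme}: about {fold:,.0f}-fold weaker than {target}")
```

On these figures the smallest margin is against Cat S (about 300-fold), with Cat B and Cat L several thousand-fold weaker.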
An important recent example is Nirmatrelvir, which is part of the nirmatrelvir/ritonavir combination used to treat COVID-19 and sold under the brand name Paxlovid. After viral entry, the positive genomic RNA of SARS-CoV-2 is translated into two large polyproteins, which are then processed by proteolysis into components for packaging new virions. This proteolysis is controlled by two protease enzymes, the coronavirus main protease (Mpro) and the papain-like protease (PLpro). Mpro acts at 11 cleavage sites and is thus a key target for drug discovery. It is a dimer of two identical subunits that together form two active sites. The protein fold is similar to serine proteases like trypsin, but a cysteine amino acid and a nearby histidine are responsible for the amide bond cleavage.
Nirmatrelvir forms a covalent bond with the active site Cys145 of the enzyme.
Strain Release Motif Warhead
An alternative strategy is the use of strain-releasing motifs: bicyclo[1.1.0]butane (BCB) derivatives can be used as electrophiles. This moiety is stable under aqueous conditions and folds into a butterfly shape through bridged carbon–carbon bonds. The bridgehead carbon can undergo nucleophilic attack by a cysteine thiol to open the ring DOI. The strain-driven nucleophilic addition to BCB amides proceeded chemoselectively with cysteine thiols under neutral aqueous conditions, at a rate significantly slower than that of acrylamide.
A nitroalkane has been shown to react with an active-site cysteine residue to yield a thiohydroximate adduct DOI.
The boron atom in bortezomib binds the catalytic site of the 26S proteasome with high affinity and specificity. The boronic acid ensures high affinity for hard oxygen nucleophiles, in contrast to soft cysteine nucleophiles, in line with the Lewis hard-soft acid-base principle [PDB 2F16].
Whilst the beta-lactams are perhaps the best known of the suicide-substrate-based irreversible inhibitors, other mechanisms are known. Pargyline is an irreversible selective monoamine oxidase (MAO)-B inhibitor drug that reacts with the redox cofactor flavin after initial oxidation.
The natural product Physostigmine and the synthetic analogue Rivastigmine both act as substrates for acetylcholinesterase. Whilst the acylated intermediate resulting from reaction with acetylcholine is rapidly hydrolysed to regenerate the enzyme, the corresponding carbamates resulting from reaction with Physostigmine or Rivastigmine are relatively stable, thus inhibiting the enzyme.
Rivastigmine has modest human pharmacokinetics DOI, with a half-life of around 1 hour; however, the duration of action is 12 h in AD patients owing to the very slow hydrolysis of the carbamate.
CovBinderInPDB: A Structure-Based Covalent Binder Database
CovBinderInPDB DOI is a fantastic resource mined from the PDB that contains 7375 covalent modifications in which 2189 unique covalent binders target nine types of amino acid residues (Cys, Lys, Ser, Asp, Glu, His, Met, Thr, and Tyr) from 3555 complex structures of 1170 unique protein chains. The database can be accessed here: https://yzhang.hpc.nyu.edu/CovBinderInPDB/. As you might expect, cysteine is the most common covalently bound amino acid, followed by serine.
Whilst a variety of covalent binding warheads have been identified, the therapeutically used acrylamides make up a significant proportion.
Computational Prediction of covalent Inhibitors
A computational pipeline has been described by the London lab to suggest covalent analogs of non-covalent ligands DOI.
Designing covalent inhibitors is increasingly important, although it remains challenging. Here, we present covalentizer, a computational pipeline for identifying irreversible inhibitors based on structures of targets with non-covalent binders. Through covalent docking of tailored focused libraries, we identify candidates that can bind covalently to a nearby cysteine while preserving the interactions of the original molecule. We found ∼11,000 cysteines proximal to a ligand across 8,386 complexes in the PDB. Of these, the protocol identified 1,553 structures with covalent predictions. In a prospective evaluation, five out of nine predicted covalent kinase inhibitors showed half-maximal inhibitory concentration (IC50) values between 155 nM and 4.5 μM. Application against an existing SARS-CoV Mpro reversible inhibitor led to an acrylamide inhibitor series with low micromolar IC50 values against SARS-CoV-2 Mpro. The docking was validated by 12 co-crystal structures. Together these examples hint at the vast number of covalent inhibitors accessible through our protocol.
RDKit was used for 2D molecular handling, conformation generation and RMSD calculation. RDKit: Open-source cheminformatics; version 2018.09.3; RDKit.org. Marvin was used in the process of preparing the molecules for docking, Marvin 17.21.0, ChemAxon (https://www.chemaxon.com). OpenBabel (http://openbabel.org/wiki/Main_Page) was used to switch between molecular file formats. DOCKovalent (London et al., 2014) was used for virtual covalent docking. The Covalentizer code is available at https://github.com/LondonLab/Covalentizer.
Mitigating Toxicity Risks
Maximise selectivity of binding to the target protein. This is key in light of emerging data suggesting that off-target binding leads to a greater risk of toxicity DOI.
We find that covalent kinase inhibitors, including approved drugs, have defined, but limited concentration windows across which selective target inhibition can be achieved. Once this selectivity range is breached, substantial off-target protein reactivity and kinase target-independent cytotoxicity is observed. Our results thus indicate that medicinal chemistry efforts aimed at optimizing the selectivity of covalent kinase inhibitors should account for their reactivity across the entire human proteome in order to ensure suitable windows of selectivity for basic pharmacology and drug development initiatives.
Reduce dose as much as possible. Reviews of the characteristics of drugs that have been withdrawn from the market because of an unacceptably high incidence of toxicity, or which have received “black box” warnings for toxicity, have highlighted the role of daily dose.
On an empirical basis, drugs administered at a total dose of 10 mg per day are unlikely to be associated with a high incidence of IDRs
Glutathione conjugation can be used as a surrogate for off-target reactivity.
Minimal formation of reactive metabolites. There is an increasing body of information that implicates reactive metabolites as the mediators of a number of drug-induced toxicities.
In a study of compounds withdrawn due to hepatotoxicity, in 5 out of 6 cases there was evidence of reactive metabolite formation.
Osimertinib is a third-generation epidermal growth factor receptor (EGFR) tyrosine kinase inhibitor designed to target the specific T790M mutation. A recent publication DOI used proteomics to look at off-target activities.
Using chemical proteomics, we show here that individual T790M-EGFR inhibitors exhibit strikingly distinct off-target profiles in human cells. The FDA-approved drug osimertinib (AZD9291), in particular, was found to covalently modify cathepsins in cell and animal models, which correlated with lysosomal accumulation of the drug.
So whilst the drug appears selective in in vitro assays, because the drug accumulates in the lysosome the selectivity is eroded.
Despite the highly engineered EGFR mutant inhibition profile achieved by all three third-generation inhibitors and their shared unsubstituted acrylamide reactive group, the inhibitors exhibited strikingly distinct proteome-wide reactivity profiles in human cancer cells.
Irreversible Inhibitors of Serine, Cysteine, and Threonine Proteases, Chemical Reviews, 2002, Vol. 102, No. 12 p 4369 DOI.
Proteome-wide covalent ligand discovery in native biological systems, Nature, 2016, Vol. 534, p 570 DOI
Covalent Modifiers: A Chemical Perspective on the Reactivity of α,β-Unsaturated Carbonyls with Thiols via Hetero-Michael Addition Reactions, DOI
Structure-based design of targeted covalent inhibitors. DOI
Emerging and Re-Emerging Warheads for Targeted Covalent Inhibitors: Applications in Medicinal Chemistry and Chemical Biology DOI
Rapid Covalent-Probe Discovery by Electrophile-Fragment Screening DOI
Covalent Modifiers Blog Link
Covalent Warheads Targeting Cysteine Residue: The Promising Approach in Drug Development DOI
Last Updated 19 January 2023