
Cambridge MedChem Consulting

This Week in Virology

An interesting weekly podcast that is currently topical.

This week Doris Cully joins TWiV to discuss inhibition of SARS-CoV-2 in cell culture by ivermectin.

COVID-19 Registered Trials

There are now a number of clinical trials underway and this review by The Centre for Evidence-Based Medicine provides an excellent summary of the trials that are taking place. They describe proposed pharmacological interventions and their mechanisms, when known, but unfortunately don't give the chemical structures.


I've also now included a few other structures that people have sent to me.

Here is the workflow I use to get the structures and access more information about the compounds.

Create a text file with the names of all the compounds mentioned


Now read the text file into Vortex


Then use a Name to Structure script that calls a web service to get the structures; in this case I used ChemSpider. Now generate the InChIKey from the structures.


We can now use the InChIKey to search UniChem using another Vortex script to get identifiers for the molecule from various databases.

UniChem efficiently produces cross-references between chemical structure identifiers from different databases
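Before sending keys to a lookup service it can help to sanity-check them. The following is a minimal sketch, assuming standard 27-character InChIKeys. The helper names are illustrative and are not part of Vortex or UniChem, and aspirin is used only as a familiar example.

```python
import re

# An InChIKey is 27 characters: a 14-letter connectivity block, a 10-letter
# block (8-letter hash plus flag and version letters), and a 1-letter
# protonation flag, separated by hyphens.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_valid_inchikey(key: str) -> bool:
    """Return True if the string has the layout of an InChIKey."""
    return INCHIKEY_RE.fullmatch(key.strip()) is not None


def connectivity_block(key: str) -> str:
    """Return the first 14 letters, which encode the molecular skeleton only."""
    key = key.strip()
    if not is_valid_inchikey(key):
        raise ValueError(f"not an InChIKey: {key!r}")
    return key.split("-")[0]


# Aspirin, used here only as a familiar example
aspirin = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
print(is_valid_inchikey(aspirin))
print(connectivity_block(aspirin))
```

Matching on the connectivity block alone is a looser comparison than matching the full key, since it ignores the stereochemistry and isotope layer, so it may be useful when different databases register different stereo forms of the same compound.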


We can then use the identifiers to search the various databases for more information

I've been asked if I could provide the structures for download

Here it is in SDF file format

And in SMILES format

The quality of the crystal structure is critical

Crystal structures are not perfect, and it is important to understand their limitations and not assume that, as Derek Lowe once put it, they are a "message from God". It might be worth reading the section on structure-based design.

With this in mind I thought I'd flag this message from Bobby Glen (Cambridge) here.

Hi, we’re still (Jason at CCDC) porting GOLD to our HPC system so we can basically parallel dock. We should be able to dock and score early next week I hope, There are a few issues we also are addressing wrt the crystal structures, Gerard Bricogne at Global Phasing is kindly re-refining the published structure from the ED, this hopefully will inform us of for making some changes to the orientation/pKa and tautomers of the histidines and some of the other AAs. It’s very difficult to ‘see’ hydrogen in x-ray and these are inferred from the structure. We need to be sure we have a decent model of this (at physiological pH) before doing all the calculations. An example is H163, which is in the binding site, and is critical to a few of the interactions seen in ligands for this class of proteases. Automated hydrogen addition can be problematic.

Help design inhibitors of the SARS-CoV-2 main protease

Are you a medicinal chemist currently locked out of your lab?

Why not take a break from writing papers/reports and lend your expertise to this effort? They have identified 60 fragment hits and are asking for insight into what should be made next.

We are now asking for your help in designing new inhibitors based on these initial fragment hits: the exceptionally dense readout suggests countless opportunities for growing and merging, and we need many sharp brains to sift through them; it is also what makes us believe that potency can be directly achieved.

The first round of submissions will be reviewed tonight and the selected molecules will be made by Enamine.

Structures of SARS-CoV-2 ligands PyMOL session files

One of the best drug targets among coronaviruses is the main protease (Mpro). This enzyme is essential for processing the polyproteins that are translated from the viral RNA, and the recognition sequence at most sites is Leu-Gln↓(Ser,Ala,Gly). Since no human enzymes have similar specificity, inhibitors should be very specific. Mpro is a 3C-like (chymotrypsin-like) cysteine protease.
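The recognition motif can be expressed as a simple pattern search. This is a rough sketch only: it treats Leu-Gln followed by Ser, Ala or Gly as the whole rule, which is a simplification of real protease specificity, and the function names and the toy sequence are invented for illustration.

```python
import re
from typing import List

# Leu-Gln followed by Ser, Ala or Gly; cleavage occurs after the Gln.
# The lookahead keeps the following residue out of the match.
CLEAVAGE_RE = re.compile(r"LQ(?=[SAG])")


def find_cleavage_sites(sequence: str) -> List[int]:
    """Return 0-based indices of the residue immediately after each cut."""
    return [m.end() for m in CLEAVAGE_RE.finditer(sequence.upper())]


def cleave(sequence: str) -> List[str]:
    """Split a one-letter protein sequence at every predicted cleavage site."""
    cuts = [0] + find_cleavage_sites(sequence) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(cuts, cuts[1:])]


# Toy sequence, not a real polyprotein
toy = "MKTAVLQSGFRKLQAEEWLQGPPT"
print(find_cleavage_sites(toy))
print(cleave(toy))
```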

I've previously described the fragment hits from a fragment screen against crystals of the main protease (MPro) of SARS-CoV-2, the virus that causes COVID-19. Full details of the screening effort are described here

I've downloaded all the structures that were screened, both those that bind and those where no binding was observed, and put them into a single file. I've also added InChIKey, SMILES, PubChem ID, PDB ID of the ligand if known, and a range of other identifiers from different databases. The file is available here
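If the file is in SDF format, the added identifiers can be pulled out without a cheminformatics toolkit. The sketch below reads only the data fields and ignores the connection tables. The tag names `InChIKey` and `SMILES` are assumptions about how the fields are labelled, and the sample record is a stub rather than real screening data.

```python
import re
from typing import Dict, Iterator

# A data header line looks like "> <TagName>" (other text may sit between > and <).
TAG_RE = re.compile(r"^>.*<([^>]+)>")


def iter_sdf_tags(text: str) -> Iterator[Dict[str, str]]:
    """Yield one {tag: value} dict per record in SD file text."""
    record: Dict[str, str] = {}
    current = None  # name of the data field being read, if any
    for line in text.splitlines():
        if line.strip() == "$$$$":        # end of record
            yield record
            record, current = {}, None
            continue
        m = TAG_RE.match(line)
        if m:                             # start of a new data field
            current = m.group(1)
            record[current] = ""
        elif current is not None:
            if line.strip() == "":        # blank line ends the field
                current = None
            elif record[current]:         # further lines of a multi-line value
                record[current] += "\n" + line
            else:
                record[current] = line


# Stub record: empty connection table plus two assumed tag names
sample = """\
aspirin
  example

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <InChIKey>
BSYNRYMUTXBXSQ-UHFFFAOYSA-N

> <SMILES>
CC(=O)Oc1ccccc1C(=O)O

$$$$
"""

for tags in iter_sdf_tags(sample):
    print(tags)
```

Records are only yielded when the `$$$$` terminator is seen, so a truncated final record is silently dropped.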

Whilst that is probably sufficient for those looking at cheminformatics-driven approaches to designing new molecules, anyone wanting to undertake structure-based design would need to download all the structures and then overlay them to visualise them on their desktop. Fortunately Manish Sud of MayaChemTools has done the hard work and generated a series of PyMOL session files that allow you to explore the enzyme crystal structure and the screening data interactively.


PyMOL is an open source molecular visualisation application; you can download it here or install it using conda

conda install -c schrodinger pymol

If you have not used it before there is a tutorial here

The PyMOL session files are set up to facilitate the analysis of protein-ligand interactions in the binding pocket. To view the files select "Open" from the file menu bar; some of the larger files may take a little while to load.

X-ray crystal structures and electron densities

COVID-19 main protease with unliganded active site (2019-nCoV, coronavirus disease 2019, SARS-CoV-2), PDB 6Y84, and the crystal structure of COVID-19 main protease in complex with the inhibitor N3, PDB 6LU7.

The PyMOL session files (zipped) can be downloaded here

Structures for non-covalent ligands

The structures of the non-covalent ligands are here.

If you are not familiar with fragment-based screening there is an introduction here including some examples of fragment growing.

It is likely that fragments will only have very modest affinity and that to completely suppress the enzyme it will require very high affinity ligands with good pharmacokinetics to achieve 100% occupancy for 24 hours per day. For this reason molecules that irreversibly bind to the enzyme might be an attractive alternative option.
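The link between affinity and occupancy can be made concrete with the simplest binding model. This sketch assumes 1:1 reversible binding at equilibrium, where fractional occupancy is [L] / ([L] + Kd). It ignores pharmacokinetics entirely, and the Kd values are illustrative round numbers rather than measured data.

```python
def fractional_occupancy(conc: float, kd: float) -> float:
    """Fraction of enzyme bound at free ligand concentration `conc`.

    Simple 1:1 reversible binding at equilibrium: [L] / ([L] + Kd).
    Both arguments must use the same concentration units.
    """
    return conc / (conc + kd)


def conc_for_occupancy(target: float, kd: float) -> float:
    """Free ligand concentration needed to reach a target occupancy (0 < target < 1)."""
    if not 0.0 < target < 1.0:
        raise ValueError("target occupancy must be strictly between 0 and 1")
    return kd * target / (1.0 - target)


# Illustrative Kd values only: a weak fragment (1 mM) and a potent lead (1 nM)
for label, kd_nm in [("fragment", 1_000_000.0), ("lead", 1.0)]:
    needed = conc_for_occupancy(0.99, kd_nm)
    print(f"{label}: Kd = {kd_nm:g} nM, {needed:g} nM free ligand for 99% occupancy")
```

Under this model 99% occupancy needs a free concentration of about 99 times Kd, and 100% is never reached by a reversible ligand, which is one way of seeing why irreversible binders are attractive here.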

Structures for covalent ligands are here

The session file containing the covalent ligands is here

This is a large file so Manish has divided it.

Whilst much of drug discovery deals with non-covalent, reversible interactions with the target protein, there is also a class of therapeutic agents that bind covalently to the target protein; these are described on this page. To mitigate the risk of off-target toxicity you will need to maximise the selectivity for the target enzyme. Glutathione conjugation can be used as a surrogate for off-target reactivity.

Getting designs made

Once you have designed a novel ligand, have a look at Design a Compound, We Will Make It

Designs will be prioritized by factors, such as ease of synthesis, and toxicity modeling, then synthesized by Enamine and tested by groups around the world. PostEra will be running machine learning algorithms in the background to triage suggestions and generate synthesis plans to enable a rapid turnaround. You will be informed of the progress of the molecules through the main stages (validation, synthesis and testing).

COVID-19 Open Research Dataset Challenge (CORD-19)

There are a number of COVID-19 Kaggle challenges open at the moment. One of the more recent is:

COVID-19 Open Research Dataset Challenge (CORD-19)

"There is a large body of research and literature continuously evolving around COVID-19. Help the research community and global organizations better digest this to answer key questions."

In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). CORD-19 is a resource of over 29,000 scholarly articles, including over 13,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease. There is a growing urgency for these approaches because of the rapid acceleration in new coronavirus literature, making it difficult for the medical research community to keep up.

You can read more about it here

Favipiravir shows promise in treatment of COVID-19

Favipiravir, also known as T-705, Avigan, or favilavir, is a drug designed to treat RNA viral infections DOI and DOI. It is phosphoribosylated by cellular enzymes to its active form, favipiravir-ribofuranosyl-5'-triphosphate (RTP), which inhibits the RNA-dependent RNA polymerase.


Favipiravir has recently been reported to be effective in the treatment of coronavirus patients Link. It appears to be effective in patients showing mild to moderate symptoms, but not effective in patients showing more severe symptoms.

A search of UniChem using the InChIKey gives details of all identifiers and links to clinical studies.

A number of clinical trials have been completed or are ongoing and can be found here.

Whilst it appears to be safe and well tolerated, and has been approved for flu, it has not yet been approved for COVID-19.

Fragment hits for SARS-CoV-2

A group of researchers including Dave Stuart, Martin Walsh, and Frank von Delft (Diamond Light Source) has performed a fragment screen against crystals of the main protease (MPro) of SARS-CoV-2, the virus that causes COVID-19. Even before fully analyzing all of the data they have released interim results

The hits can be viewed on Fragalysis here.

I've downloaded all the structures that were screened, both those that bind and those where no binding was observed, and put them into a single file. I've also added InChIKey, SMILES, PubChem ID, PDB ID of the ligand if known, and a range of other identifiers from different databases. The file is available here


Potential 2019-nCoV 3C-like Protease Inhibitors

Chris Southan recently flagged a number of publications describing possible treatments for 2019-nCoV using repurposed existing drugs, "Therapeutic options for the 2019 novel coronavirus (2019-nCoV)" DOI. In addition, a recent preprint, "Potential 2019-nCoV 3C-like Protease Inhibitors Designed Using Generative Deep Learning Approaches" DOI, highlighted the design of potential protease inhibitors. The authors provide the structures of the molecules in the supplementary information.

I downloaded the SDF file and used a Jupyter notebook to calculate a range of physicochemical properties; the results are shown in the plot below.


As is often seen with protease inhibitors, the molecular weight is rather high, with the majority of compounds having a molecular weight >500. The calculated LogP is mainly in the range 2 to 5; however, because 40% of the molecules are predicted to be basic, the LogD is mainly in the range 0 to 4. This combination of high molecular weight and rather high LogP is likely to compromise the developability score (for more details on the developability score, read the "20 years Rule of Five" report here).
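The drop from LogP to LogD for basic molecules follows from ionisation. Here is a minimal sketch using the textbook relationship for a compound with a single ionisable group, assuming only the neutral form partitions into the organic phase. It is not necessarily the method used by the prediction software, and the pKa and LogP values are made up for illustration.

```python
import math


def log_d(log_p: float, pka: float, ph: float = 7.4, basic: bool = True) -> float:
    """Estimate LogD at a given pH for a compound with one ionisable group.

    Base: LogD = LogP - log10(1 + 10**(pKa - pH))
    Acid: LogD = LogP - log10(1 + 10**(pH - pKa))
    Assumes only the neutral species partitions into the organic phase.
    """
    exponent = (pka - ph) if basic else (ph - pka)
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical amine: LogP 4.0, pKa 9.4, evaluated at pH 7.4
print(round(log_d(4.0, 9.4), 2))

# Hypothetical very weak base (pKa 2.0): barely ionised at pH 7.4
print(round(log_d(4.0, 2.0), 2))
```

A base with a pKa two units above the pH is about 99% ionised, which lowers LogD by roughly two log units relative to LogP.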


The molecules are also predicted to have a rather high number of hydrogen bond donors and acceptors, which contributes to a high polar surface area (TPSA). In general, TPSA values >120 are often associated with poor oral bioavailability. However, it should be noted that the TPSA was not calculated on a 3D structure, and it is possible that intramolecular hydrogen bonds may reduce the actual TPSA, as also described in the 20 years Rule of Five report.
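Once the descriptors have been calculated, flagging molecules against these guidelines is straightforward. The sketch below applies Lipinski's four Rule of Five criteria and the TPSA cutoff of 120 mentioned above to precomputed values. The class, the function names and the two example molecules are hypothetical, and this is not the developability score itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Properties:
    """Precomputed descriptors for one molecule (values come from your own tool)."""
    name: str
    mol_weight: float  # Da
    clogp: float
    hbd: int           # hydrogen bond donors
    hba: int           # hydrogen bond acceptors
    tpsa: float        # topological polar surface area, square angstroms


def rule_of_five_violations(p: Properties) -> List[str]:
    """List which of Lipinski's four Rule of Five criteria a molecule breaks."""
    checks = [
        (p.mol_weight > 500, "MW > 500"),
        (p.clogp > 5, "cLogP > 5"),
        (p.hbd > 5, "HBD > 5"),
        (p.hba > 10, "HBA > 10"),
    ]
    return [label for failed, label in checks if failed]


def high_tpsa(p: Properties, cutoff: float = 120.0) -> bool:
    """Flag TPSA above the cutoff discussed in the text (120 by default)."""
    return p.tpsa > cutoff


# Hypothetical descriptor values, for illustration only
examples = [
    Properties("small_fragment", 210.2, 1.4, 1, 3, 55.0),
    Properties("large_peptidomimetic", 642.8, 4.1, 6, 11, 168.0),
]
for p in examples:
    print(p.name, rule_of_five_violations(p), "high TPSA" if high_tpsa(p) else "TPSA ok")
```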

Scanning through the molecules I noticed a number of functional groups that might be a concern (e.g. Michael acceptors). I ran a couple of Vortex scripts that flag potentially problematic groups based on SMARTS taken from the following publications

I also ran a couple of scripts that flag potential liver toxicity or hERG liabilities. These flags should not be used to exclude molecules but to mark molecules for experimental checking. The scripts identify the functional group that has been flagged as a liver toxicity liability, and the most similar molecule in ChEMBL that has hERG activity. The results are shown in the image below.
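The nearest-neighbour part of such a check can be sketched in a few lines. This example assumes fingerprints are already available as sets of "on" bit indices and uses the Tanimoto coefficient. The similarity measure used by the Vortex script is not stated, so Tanimoto is an assumption, and the fingerprints and names below are invented.

```python
from typing import Dict, Optional, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints stored as sets of 'on' bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def most_similar(
    query: Set[int], reference: Dict[str, Set[int]]
) -> Tuple[Optional[str], float]:
    """Return (name, similarity) of the reference fingerprint closest to the query.

    Returns (None, -1.0) if the reference set is empty.
    """
    best_name, best_score = None, -1.0
    for name, fp in reference.items():
        score = tanimoto(query, fp)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Invented fingerprints standing in for known hERG-active reference compounds
herg_actives = {"ref_A": {1, 2, 3, 5}, "ref_B": {7, 8}}
query_fp = {1, 2, 3, 4}
print(most_similar(query_fp, herg_actives))
```

A high similarity to a known active is only a prompt to test the compound, in keeping with the advice above that these flags should not be used to exclude molecules.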


I also added InChIKeys for better cross referencing.

I've exported all results as an sdf file which can be found here