
Cambridge MedChem Consulting

Discovery of novel antibiotic Halicin using deep learning

A recent paper, "A Deep Learning Approach to Antibiotic Discovery" DOI, from Regina Barzilay's group at MIT has attracted a lot of attention. They trained a deep neural network model to predict growth inhibition of Escherichia coli using a collection of 2,335 molecules. Each molecule was described using Morgan fingerprints computed with RDKit, using a radius of 2 and 2048-bit fingerprint vectors. Using this methodology they identified the known c-Jun N-terminal kinase inhibitor SU3327, which they renamed Halicin. A quick search using MolSeeker allowed identification of the structure and InChIKey.


A search of UniChem using the InChIKey NQQBNZBOOHHVQP-UHFFFAOYSA-N identified a number of other identifiers in different databases.
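Before pasting an InChIKey into a lookup service it can be worth checking that it has the expected layout. The sketch below only tests the format of a standard InChIKey (14 letters, 10 letters, 1 letter, separated by hyphens). It does not query UniChem or confirm that the key corresponds to a real structure, and the function names are purely illustrative.

```python
import re

# Layout of an InChIKey: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_PATTERN = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def looks_like_inchikey(text: str) -> bool:
    """Return True if text has the 27-character InChIKey layout (format check only)."""
    return INCHIKEY_PATTERN.fullmatch(text.strip()) is not None


def split_inchikey(key: str) -> dict:
    """Split an InChIKey into its blocks (illustrative helper, not a library API)."""
    if not looks_like_inchikey(key):
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    first, second, third = key.strip().split("-")
    return {
        "connectivity": first,          # hash of the molecular skeleton
        "stereo_isotope": second[:8],   # hash of the remaining layers
        "standard_flag": second[8],     # 'S' for a standard InChI
        "version": second[9],           # 'A' for InChI version 1
        "protonation": third,           # 'N' means neutral
    }


halicin_key = "NQQBNZBOOHHVQP-UHFFFAOYSA-N"
print(looks_like_inchikey(halicin_key))
print(split_inchikey(halicin_key))
```

Searching on the first 14-character block alone is a common way to find related records that share the same skeleton but differ in stereochemistry or protonation.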


These include a link to the ChEMBL entry CHEMBL510038, which gives the biological data (0.7 nM inhibition of c-Jun N-terminal kinase in a time-resolved FRET assay) and links to the original 2009 publication DOI describing the c-JNK SAR. The compound has a rat half-life of 0.45 h. There is another publication that might be of interest, describing "Discovery of 2-(5-nitrothiazol-2-ylthio)benzo[d]thiazoles as novel c-Jun N-terminal kinase inhibitors" DOI.

Certainly an interesting approach. I suspect the nitrothiazole functionality would set off a few structural alerts, but there are plenty of similar compounds commercially available that would allow exploration of the SAR without too much investment in resources.

All code and data are available on GitHub, and there is also a website where you can test your own molecules.


Potential 2019-nCoV 3C-like Protease Inhibitors

Chris Southan recently flagged a number of publications describing possible treatments for 2019-nCoV using repurposed existing drugs, including "Therapeutic options for the 2019 novel coronavirus (2019-nCoV)" DOI. In addition, a recent preprint, "Potential 2019-nCoV 3C-like Protease Inhibitors Designed Using Generative Deep Learning Approaches" DOI, highlighted the design of potential protease inhibitors. The authors provide the structures of the molecules in the supplementary information.

I downloaded the SDF file and used a Jupyter notebook to calculate a range of physicochemical properties; the results are shown in the plot below.


As is often seen with protease inhibitors, the molecular weight is rather high, with the majority of compounds having Mol Weight >500. The calculated LogP is mainly in the range 2 to 5; however, because 40% of the molecules are predicted to be basic, the LogD is mainly in the range 0 to 4. This combination of high molecular weight and rather high LogP is likely to compromise the developability score (for more details on the developability score read the "20 years Rule of Five" report here).
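To see why basic molecules end up with a LogD well below their LogP, here is a minimal sketch using the textbook relationship for a compound with a single ionisable centre, assuming only the neutral species partitions into the organic phase. This is not the method used by the property calculator behind the plot, and the LogP and pKa values in the example are hypothetical.

```python
import math


def logd_monoprotic_base(logp: float, pka: float, ph: float = 7.4) -> float:
    """LogD of a base with one ionisable centre (pka is that of the conjugate acid)."""
    return logp - math.log10(1.0 + 10.0 ** (pka - ph))


def logd_monoprotic_acid(logp: float, pka: float, ph: float = 7.4) -> float:
    """LogD of an acid with one ionisable centre."""
    return logp - math.log10(1.0 + 10.0 ** (ph - pka))


# Hypothetical amine: LogP 4.0, pKa 9.4, so about two log units are lost at pH 7.4.
print(round(logd_monoprotic_base(4.0, 9.4), 2))

# Hypothetical weak base (pKa 2.4): essentially neutral at pH 7.4, LogD is close to LogP.
print(round(logd_monoprotic_base(4.0, 2.4), 2))
```

Each pKa unit above the pH costs a base roughly one log unit of LogD, which is consistent with a LogP range of 2 to 5 shifting down towards 0 to 4 when a sizeable fraction of the set is basic.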


The molecules are also predicted to have a rather high number of hydrogen bond donors and acceptors, which contributes to a high polar surface area (TPSA). In general, TPSA values >120 are often associated with poor oral bioavailability. However, it should be noted that the TPSA was not calculated on a 3D structure, and it is possible that intramolecular hydrogen bonds may reduce the actual polar surface area, as also described in the 20 years Rule of Five report.
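The two thresholds mentioned above (Mol Weight >500 and TPSA >120) are easy to turn into simple flags once the properties have been calculated. The sketch below assumes the properties are already available as numbers. The class name, function name and property values are all made up for illustration and are not taken from the preprint's molecules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MoleculeProps:
    name: str
    mol_weight: float  # g/mol
    tpsa: float        # topological polar surface area, in square angstroms


def property_flags(props: MoleculeProps,
                   mw_limit: float = 500.0,
                   tpsa_limit: float = 120.0) -> List[str]:
    """Return a list of warnings; an empty list means neither threshold is exceeded."""
    flags = []
    if props.mol_weight > mw_limit:
        flags.append("high MW")
    if props.tpsa > tpsa_limit:
        flags.append("high TPSA")
    return flags


# Hypothetical, made-up property values purely for illustration.
examples = [
    MoleculeProps("example_A", 612.7, 145.2),
    MoleculeProps("example_B", 438.5, 98.4),
]
for mol in examples:
    print(mol.name, property_flags(mol) or ["no flags"])
```

As with any property cut-off, these flags are prompts for a closer look rather than hard filters.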

Scanning through the molecules I noticed a number of functional groups that might be a concern (e.g. Michael acceptors), so I ran a couple of Vortex scripts that flag potentially problematic groups based on SMARTS taken from the following publications

I also ran a couple of scripts that flag potential liver toxicity or hERG liabilities. These flags should not be used to exclude molecules but to mark molecules for experimental checking. The scripts identify the functional group that has been flagged as a liver toxicity liability, and the most similar molecule in ChEMBL that has hERG activity. The results are shown in the image below.


I also added InChIKeys for better cross-referencing.

I've exported all results as an SDF file, which can be found here.

SCI-RSC Workshop on Computational Tools for Drug Discovery 2020

The SCI Fine Chemicals Group and RSC Chemical Information and Computer Applications Group are organising a second Workshop on Computational Tools for Drug Discovery. The meeting format will be the same as the very successful meeting run in Birmingham in 2019.

The 2020 workshop will be held on 19 May 2020 at Riverside West, Whitehall Road, Leeds, West Yorkshire, LS1 4AW.



The Workshop Providers and Facilitators are

  • Al Dossetter, MedChemica
  • Greg Landrum, KNIME
  • Gunther Stahl, OpenEye
  • Ilenia Giangreco, CCDC
  • Matt Segall, Optibrium
  • Stuart Firth-Clark, Cresset

Attendees will be able to choose 4 of the 6 sessions.

To select which workshops you would like to attend for each session, please complete the survey on the website. Please note that spaces are allocated on a first-come, first-served basis.

More details of the workshops and registration details are on the website shown below

Who’s sharing their clinical trial results?

An interesting recent publication, "Compliance with legal requirement to report clinical trial results on ClinicalTrials.gov: a cohort study" DOI, has highlighted the failure of many institutions to report the results of clinical trials within 1 year of completion, as required by law.

4209 trials were due to report results; 1722 (40·9%; 95% CI 39·4–42·2) did so within the 1-year deadline. 2686 (63·8%; 62·4–65·3) trials had results submitted at any time. Compliance has not improved since July, 2018.

Thus nearly 60% of trials are not reported within the deadline. They also looked at the relative compliance of the different sectors:

Industry sponsors and large (experienced) sponsors were most likely to report trial data, whereas universities were the least likely. The sponsor with the lowest compliance was the US government.

To aid monitoring they have produced the FDAAA TrialsTracker, which allows anyone to check compliance.
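The headline percentages quoted from the paper can be checked with a few lines of arithmetic. This sketch uses only the three counts given in the quote. The variable names are my own, and the confidence intervals are not recomputed because the paper's interval method is not stated here.

```python
def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Counts quoted from the paper.
trials_due = 4209
reported_on_time = 1722
reported_any_time = 2686

on_time_pct = percent(reported_on_time, trials_due)
any_time_pct = percent(reported_any_time, trials_due)
missed_deadline_pct = 100.0 - on_time_pct

# Derived counts: reported after the deadline, and no results at the time of analysis.
reported_late = reported_any_time - reported_on_time
no_results_yet = trials_due - reported_any_time

print(f"Reported within 1 year: {on_time_pct:.1f}%")
print(f"Reported at any time:   {any_time_pct:.1f}%")
print(f"Missed the deadline:    {missed_deadline_pct:.1f}%")
print(f"Reported late: {reported_late}, no results yet: {no_results_yet}")
```

The "nearly 60%" figure therefore includes trials that did eventually report, just after the 1-year deadline.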


31st symposium on Medicinal Chemistry in Eastern England

The Symposium on Medicinal Chemistry in Eastern England, known colloquially as the "Hatfield MedChem" meeting, is a highly successful, long-standing, one-day meeting which runs annually. The scientific programme comprises presentations showcasing medicinal chemistry case studies from tools to candidates, across a range of modalities, therapeutic areas and target classes, as well as covering more general topics from the forefront of drug discovery of relevance to medicinal chemists. The meeting aims to be informal and interactive, and the event will offer excellent scientific and networking opportunities for all those working in medicinal chemistry and drug discovery.

It will take place on Thursday 30th April 2020 at The Fielder Centre, Hatfield, Hertfordshire, UK

Registration is now open.

Full details of the scientific programme and registration details are on the website

This is always a very popular meeting, so early registration is recommended.

Twitter hashtag #HatfieldMedChem20

Phenotypic Screening now offered by the European Lead Factory

The European Lead Factory has announced that it can now offer two types of phenotypic screening:

  • A high-throughput, but “lower content” phenotypic approach that is suited to screening ELF’s entire compound collection, and
  • A more complex “high content” screening approach using microscopy or flow cytometry to probe phenotype on a smaller subset of the compound collection

While low content assays can be live measurements or have fixed end points and involve well-averaged readouts, high content assays can be much more complex, based on live or fixed cells, multiple cell types and usually have more than one parameter as a readout. The complexity of the latter workflow makes it better suited to being performed on a smaller representative subset of the large collection.

Phenotypic screening historically has been the basis for the discovery of many drugs. Compounds are screened in cellular or animal disease models to identify compounds that cause a desirable change in phenotype. Only after the compounds have been discovered are efforts made to determine the biological targets of the compounds - a process known as target deconvolution.

Proposals for phenotypic screening approaches follow the normal review and selection process. A dedicated application form is available here.

The submission deadline for the next review and selection round is February 7, 2020.

EFMC Prize for a Young Medicinal Chemist in Industry/Academia

I just thought I'd highlight this award.

The EFMC created the “EFMC Prize for a Young Medicinal Chemist in Industry/Academia” as we felt it was important to acknowledge and recognise outstanding young medicinal chemists (≤ 12 years after PhD) working in European industry and academia. The 2020 Prizes will be given at the XXVI EFMC "International Symposium on Medicinal Chemistry" (EFMC-ISMC 2020) to be held in Basel, Switzerland on September 6-10, 2020. Both prizes consist of a diploma, an invitation to give an oral communication at the EFMC-ISMC, and a cash prize of € 1,000.

To find out more about the regulations and the application procedure, visit the EFMC website. The closing date is Jan 31 2020.

Drugs approved by EMA in 2019

I recently posted details of the small molecule drugs approved by the FDA in 2019. This generated considerable interest, and I thought it might be worthwhile doing a similar thing for the drug approvals in Europe. However, this turns out to be less straightforward. Medicines can be authorised in several European countries simultaneously by using one of three procedures: the 'centralised procedure', the 'mutual-recognition procedure' or the 'decentralised procedure'. Medicines can also be authorised in a single Member State by using the national authorisation procedure of that country. The European Medicines Agency is responsible for the centralised procedure, so I downloaded just the drugs approved via this mechanism.

Of the 61 approvals in 2019, 45 were small molecule drugs and 16 were biologics. The structures of the small molecules are shown below


Looking at the calculated physicochemical properties of the small molecules, one thing is quite interesting: around 50% are predicted to be ionised at physiological pH.
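Whether a molecule counts as "ionised at physiological pH" depends on its pKa relative to pH 7.4. As a rough sketch, the Henderson-Hasselbalch relationship gives the ionised fraction for a molecule with a single ionisable group. The pKa values below are hypothetical and are not those of any of the approved drugs, and this is not necessarily how the prediction tool used for the plot makes its call.

```python
def fraction_ionised(pka: float, ph: float = 7.4, kind: str = "base") -> float:
    """Fraction of a monoprotic acid or base present in its charged form at the given pH."""
    if kind == "base":
        # A base is protonated (charged) when the pH is below its pKa.
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    if kind == "acid":
        # An acid is deprotonated (charged) when the pH is above its pKa.
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


# Hypothetical pKa values for illustration only.
print(f"amine, pKa 9.4:           {fraction_ionised(9.4, kind='base'):.3f}")
print(f"carboxylic acid, pKa 4.4: {fraction_ionised(4.4, kind='acid'):.3f}")
print(f"weak base, pKa 5.4:       {fraction_ionised(5.4, kind='base'):.3f}")
```

A base with a pKa equal to the pH is exactly half ionised, so a simple "mostly ionised" classification amounts to asking which side of 7.4 the pKa falls on.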


As shown in the plot below (Blue = small molecules, Green = Biologics) the largest group of drugs were Antineoplastic agents, the next largest groups being anti-virals and immunosuppressants.



Four "biosimilars" were also approved. Kromeya and Idacio both contain adalimumab (Humira), a TNF-alpha inhibitor, as the active ingredient. Adalimumab was the first fully human monoclonal antibody approved by the FDA in 2002.

Grasustek contains the active substance pegfilgrastim (Neulasta), a PEGylated form of the recombinant human granulocyte colony-stimulating factor (GCSF) analog filgrastim. Zirabev contains the active substance bevacizumab (Avastin), which blocks angiogenesis by inhibiting vascular endothelial growth factor A (VEGF-A).

These four drugs join a number of other biosimilars approved in Europe, with the UK in particular keen to move to biosimilars. Biosimilars are expected to save the EU up to $44 billion in health care costs by 2020 LINK.

Most important, the EU is realizing the benefits of biosimilars without sacrificing safety or quality. Of the biosimilars approved since 2006, none have been withdrawn or suspended for safety or efficacy reasons. Further, regulators have not identified any differences in the nature, severity or frequency of adverse effects between biosimilars and biologics.

Fragment based screening pages updated

I spent some time over the Christmas break updating the Drug Discovery Resources pages on Fragment-Based screening, adding new vendors and updating the physicochemical profiles. I've also added some discussion on the elaboration/optimisation of fragments.

The pages are

Fragment-Based Screening
Building a Fragment Collection
Available Fragment Collections
Profiles of Fragment Collections
Fragment-Based Screening Published Hits

The published fragment hits database contains details of fragments that have been reported as hits in the literature. It now has over 1500 entries culled from over 310 publications, directed at nearly 220 different molecular targets using 26 different detection technologies.


It could be argued that published fragment hits give us an insight into the best fragments to include in library design.

Small molecules approved by FDA in 2019

The FDA approved a total of 48 drugs in 2019; the "small" molecules are shown below.


Calculated physicochemical properties for the individual components (I guess some are not so small).