D3R Grand Challenge 2: blind prediction of protein–ligand poses, affinity rankings, and relative binding free energies

05 12 17 - Filed In: Docking | hit identification | virtual screening

The Drug Design Data Resource (D3R) is an NIH-funded resource dedicated to improving method development in ligand docking and scoring through community-wide blinded prediction challenges (http://www.drugdesigndata.org).

The Drug Design Data Resource (D3R) ran Grand Challenge 2 (GC2) from September 2016 through February 2017. This challenge was based on a dataset of structures and affinities for the nuclear receptor farnesoid X receptor (FXR), contributed by F. Hoffmann-La Roche. The dataset contained 102 IC50 values, spanning six orders of magnitude, and 36 high-resolution co-crystal structures with representatives of four major ligand classes. Strong global participation was evident, with 49 participants submitting 262 prediction submission packages in total. Procedurally, GC2 mimicked Grand Challenge 2015 (GC2015), with a Stage 1 sub-challenge testing ligand pose prediction methods and ranking and scoring methods, and a Stage 2 sub-challenge testing only ligand ranking and scoring methods after the release of all blinded co-crystal structures. Two smaller curated sets of 18 and 15 ligands were developed to test alchemical free energy methods. This overview summarises all aspects of GC2, including the dataset details, challenge procedures, and participant results. We also consider implications for progress in the field, while highlighting methodological areas that merit continued development. Similar to GC2015, the outcome of GC2 underscores the pressing need for methods development in pose prediction, particularly for ligand scaffolds not currently represented in the Protein Data Bank (http://www.pdb.org), and in affinity ranking and scoring of bound ligands.

Conclusions:

Successful prediction of ligand–protein poses depends on the entire workflow, including factors extrinsic to the core docking algorithm, such as the conformation of the protein selected.
The accuracy of pose predictions tends to be improved by the use of available structural data, via ligand overlays and/or selection of receptor structures solved with similar ligands.
The accuracy of the poses used in structure-based affinity rankings does not clearly correlate with ranking accuracy.
Explicit solvent free energy methods did not, overall, provide greater accuracy than faster, less detailed scoring methods.