Cambridge MedChem Consulting

AlphaFold Protein Structure Database

The AlphaFold Protein Structure Database, developed by DeepMind and EMBL-EBI, is now available online.

AlphaFold DB provides open access to protein structure predictions for the human proteome and 20 other key organisms to accelerate scientific research.

AlphaFold DB currently provides predicted structures for the organisms listed below, which include human, laboratory species, and key pathogens. The predictions for every species can be downloaded from the EBI FTP site.

| Species | Common name | Reference proteome | Predicted structures | Download size (MB) |
| Arabidopsis thaliana | Arabidopsis | UP000006548 | 27,434 | 3642 |
| Caenorhabditis elegans | Nematode worm | UP000001940 | 19,694 | 2601 |
| Candida albicans | C. albicans | UP000000559 | 5,974 | 965 |
| Danio rerio | Zebrafish | UP000000437 | 24,664 | 4141 |
| Dictyostelium discoideum | Dictyostelium | UP000002195 | 12,622 | 2150 |
| Drosophila melanogaster | Fruit fly | UP000000803 | 13,458 | 2174 |
| Escherichia coli | E. coli | UP000000625 | 4,363 | 448 |
| Glycine max | Soybean | UP000008827 | 55,799 | 7142 |
| Homo sapiens | Human | UP000005640 | 23,391 | 4784 |
| Leishmania infantum | L. infantum | UP000008153 | 7,924 | 1481 |
| Methanocaldococcus jannaschii | M. jannaschii | UP000000805 | 1,773 | 171 |
| Mus musculus | Mouse | UP000000589 | 21,615 | 3547 |
| Mycobacterium tuberculosis | M. tuberculosis | UP000001584 | 3,988 | 421 |
| Oryza sativa | Asian rice | UP000059680 | 43,649 | 4416 |
| Plasmodium falciparum | P. falciparum | UP000001450 | 5,187 | 1132 |
| Rattus norvegicus | Rat | UP000002494 | 21,272 | 3404 |
| Saccharomyces cerevisiae | Budding yeast | UP000002311 | 6,040 | 960 |
| Schizosaccharomyces pombe | Fission yeast | UP000002485 | 5,128 | 776 |
| Staphylococcus aureus | S. aureus | UP000008816 | 2,888 | 268 |
| Trypanosoma cruzi | T. cruzi | UP000002296 | 19,036 | 2905 |
| Zea mays | Maize | UP000007305 | 39,299 | 5014 |
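If you are thinking about downloading everything, it is worth adding up the table first. The short Python sketch below simply copies the structure counts and download sizes from the table above and totals them; the names PROTEOMES and totals are my own and not part of any AlphaFold tooling.

```python
# (species, predicted structures, download size in MB), copied from the table above.
PROTEOMES = [
    ("Arabidopsis thaliana", 27434, 3642),
    ("Caenorhabditis elegans", 19694, 2601),
    ("Candida albicans", 5974, 965),
    ("Danio rerio", 24664, 4141),
    ("Dictyostelium discoideum", 12622, 2150),
    ("Drosophila melanogaster", 13458, 2174),
    ("Escherichia coli", 4363, 448),
    ("Glycine max", 55799, 7142),
    ("Homo sapiens", 23391, 4784),
    ("Leishmania infantum", 7924, 1481),
    ("Methanocaldococcus jannaschii", 1773, 171),
    ("Mus musculus", 21615, 3547),
    ("Mycobacterium tuberculosis", 3988, 421),
    ("Oryza sativa", 43649, 4416),
    ("Plasmodium falciparum", 5187, 1132),
    ("Rattus norvegicus", 21272, 3404),
    ("Saccharomyces cerevisiae", 6040, 960),
    ("Schizosaccharomyces pombe", 5128, 776),
    ("Staphylococcus aureus", 2888, 268),
    ("Trypanosoma cruzi", 19036, 2905),
    ("Zea mays", 39299, 5014),
]


def totals(rows):
    """Return (total predicted structures, total download size in MB)."""
    n_structures = sum(count for _, count, _ in rows)
    size_mb = sum(size for _, _, size in rows)
    return n_structures, size_mb


n_structures, size_mb = totals(PROTEOMES)
print(f"{len(PROTEOMES)} proteomes, {n_structures:,} structures, {size_mb:,} MB")
```

That comes to 365,198 structures and 52,542 MB across the 21 proteomes, so allow a little over 50 GB of disk space for the compressed downloads alone.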

The search bar at the top of the query page accepts a protein name, gene name, UniProt identifier, or organism name. At present you can't search by sequence to find similar proteins; you would first need to run a BLAST search and then use the hits from that as queries.
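As a rough illustration of that BLAST-then-query workaround, the Python sketch below turns a list of BLAST subject IDs into AlphaFold DB entry page links. It assumes the hits are UniProt-style IDs (either a bare accession or the sp|ACCESSION|NAME form) and that entry pages live at alphafold.ebi.ac.uk/entry/ followed by the accession, which is how the site is laid out at the time of writing. The function names and the example hit list are made up for illustration.

```python
import re

# UniProt accession pattern as given in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def accession_from_blast_id(subject_id):
    """Pull the accession out of an ID such as 'sp|P00918|CAH2_HUMAN'.

    Returns None if nothing that looks like a UniProt accession is found.
    """
    parts = subject_id.split("|")
    candidate = parts[1] if len(parts) >= 3 else subject_id
    candidate = candidate.strip().upper()
    return candidate if UNIPROT_ACCESSION.match(candidate) else None


def alphafold_entry_urls(subject_ids):
    """Map each distinct valid accession to its AlphaFold DB entry page."""
    urls = {}
    for subject_id in subject_ids:
        accession = accession_from_blast_id(subject_id)
        if accession and accession not in urls:
            urls[accession] = f"https://alphafold.ebi.ac.uk/entry/{accession}"
    return urls


# Hypothetical hit list, not real BLAST output.
hits = ["sp|P00918|CAH2_HUMAN", "Q8IHW5", "not-an-id"]
for accession, url in alphafold_entry_urls(hits).items():
    print(accession, url)
```

Anything that does not look like a UniProt accession is silently dropped, and a hit is only useful if its organism is among those covered by the database.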

Here I searched for Plasmodium falciparum carbonic anhydrase (Q8IHW5), a potential malaria target. As you can see, there is no crystal structure in the PDB. Whilst the active site is predicted with high confidence, there are clearly regions for which the confidence is very low.
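The confidence colouring on the entry page reflects the per-residue pLDDT score (0 to 100), which AlphaFold DB writes into the B-factor column of its model files. If you want numbers rather than colours, a minimal Python sketch along these lines tallies residues per confidence band from PDB-format text. The band cut-offs (90, 70, 50) follow the legend on the AlphaFold DB pages; the function names are my own, and the treatment of values exactly on a boundary is an assumption.

```python
from collections import Counter


def residue_plddt(pdb_text):
    """Map (chain, residue number) -> pLDDT, read from the CA atom of each residue.

    Uses the fixed PDB columns: atom name 13-16, chain 22,
    residue number 23-26, B-factor (here pLDDT) 61-66.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def confidence_band(plddt):
    """Assign a pLDDT value to one of the four bands used on the site."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def band_counts(scores):
    """Count residues in each confidence band."""
    return Counter(confidence_band(value) for value in scores.values())
```

Calling band_counts(residue_plddt(open(path).read())) on a downloaded model, where path is wherever you saved the file, gives a quick feel for how much of the protein is worth trusting before any modelling work starts.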


You can then download the structure in PDB or mmCIF format.
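If you would rather script the download than click the button, the sketch below builds the file URL for a given UniProt accession and fetches it. It assumes the file naming scheme AF-<accession>-F<fragment>-model_v<version> under alphafold.ebi.ac.uk/files/, which is what the download links use at the time of writing; the version suffix changes between database releases, so the default of 4 here is an assumption you should check against the link on the entry page. The function names are my own.

```python
import urllib.request


def alphafold_file_url(accession, fmt="pdb", version=4, fragment=1):
    """Build the download URL for one model file.

    fmt is 'pdb' or 'cif' (mmCIF). The version default is an assumption;
    check the entry page for the current release.
    """
    if fmt not in ("pdb", "cif"):
        raise ValueError("fmt must be 'pdb' or 'cif'")
    return (
        "https://alphafold.ebi.ac.uk/files/"
        f"AF-{accession}-F{fragment}-model_v{version}.{fmt}"
    )


def download_model(accession, fmt="pdb", version=4):
    """Download the model into the current directory and return the file name."""
    url = alphafold_file_url(accession, fmt, version)
    filename = url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, filename)
    return filename


# Example (needs network access):
# download_model("Q8IHW5", fmt="cif")
```

Most proteins have a single fragment (F1); only very long sequences are split into several overlapping fragments.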

I made a homology model (in purple below) of this protein a while back; the protein has little sequence similarity with any proteins in the PDB. Although the AlphaFold predicted structure does not include a zinc, it places histidines in positions where they could coordinate the zinc. If it is possible to include the zinc in the structure prediction I'd be interested in finding out.
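One quick way to check the histidine arrangement without opening a viewer is to measure distances between the imidazole nitrogens (ND1 and NE2) of each histidine in the downloaded PDB file. The Python sketch below lists histidine pairs whose nitrogens come close to each other. The 4.5 Å cut-off is an arbitrary illustrative choice rather than a validated criterion for a metal site, and the function names are my own.

```python
from itertools import combinations
from math import dist


def histidine_nitrogens(pdb_text):
    """Map residue number -> list of (x, y, z) for ND1/NE2 atoms of HIS residues."""
    atoms = {}
    for line in pdb_text.splitlines():
        if (
            line.startswith("ATOM")
            and line[17:20] == "HIS"
            and line[12:16].strip() in ("ND1", "NE2")
        ):
            resnum = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.setdefault(resnum, []).append(xyz)
    return atoms


def close_histidine_pairs(his_atoms, cutoff=4.5):
    """Return (res_a, res_b, distance) for pairs within the cutoff (angstroms).

    The distance is the shortest nitrogen-nitrogen distance between the
    two residues; the default cutoff is an assumption for illustration.
    """
    pairs = []
    for res_a, res_b in combinations(sorted(his_atoms), 2):
        shortest = min(
            dist(a, b) for a in his_atoms[res_a] for b in his_atoms[res_b]
        )
        if shortest <= cutoff:
            pairs.append((res_a, res_b, round(shortest, 2)))
    return pairs
```

Histidines that show up together in several pairs are the ones to look at first as a possible metal site, though this says nothing about other residue types that might also take part in coordination.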


Overall, I'd say this is a very useful starting point.