ChEMBL

Manually curated database of bioactive molecules; the backbone of most ML chemistry benchmarks.

Composite
97.5
Experimental validation
Wet-lab confirmed
Stages
Hit ID, Lead ID / ADMET
Modalities
small-molecule, protein-general
Task types
bioactivity, data-resource
Size
compounds: 2,400,000
activities: 20,700,000
targets: 15,398
License
CC-BY-SA 3.0
First release
2009
Last updated
2025-05
Official site
→ project page
Leaderboard
→ leaderboard
Dataset
→ dataset
Code / GitHub
→ repository
HuggingFace
→ HF
Paper
The ChEMBL Database in 2023 · Zdrazil B, Felix E, Hunter F, et al. · 2024 · paper · doi:10.1093/nar/gkad1004 · 800 citations
Flags
none
Experts
Andrew Leach, Barbara Zdrazil
Groups
EMBL-EBI
Hosted by
ELIXIR Infrastructure
Related benchmarks
TDC ADMET Group, MoleculeNet, PubChem BioAssay

Rubric (7-criterion)

rigor: 5
coverage: 5
maintenance: 5
adoption: 5
quality: 4
accessibility: 5
industry_relevance: 5

Notes

Underlies ~80% of public bioactivity ML benchmarks.
