USPTO-50K / USPTO-MIT (Retrosynthesis)
Reactions extracted from USPTO patents; standard retrosynthesis/forward-reaction benchmark.
Composite
78.0
Experimental validation
Retrospective
Stages
Lead ID / ADMET
Developmental Candidate
Modalities
small-molecule
Task types
retrosynthesis
reaction-prediction
Size
reactions: 1,800,000
canonical_50k: 50,037
License
Public
First release
2017
Last updated
2023
Official site
Leaderboard
→ leaderboard
Dataset
Code / GitHub
HuggingFace
→ HF
Paper
Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models · Liu B, Ramsundar B, Kawthekar P, et al. · 2017 · paper · doi:10.1021/acscentsci.7b00303 · 520 citations
Flags
data-leakage-known
Experts
Groups
Hosted by
Related benchmarks
—
Rubric (7-criterion)
rigor: 4
coverage: 4
maintenance: 2
adoption: 5
quality: 3
accessibility: 5
industry_relevance: 4
Notes
Leakage across the canonical splits is known; prefer a time-based split or Open Reaction Database (ORD) data for a fairer evaluation.
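A time-based split of the kind suggested in the note could be sketched as follows. This is a minimal illustration, not the benchmark's official procedure: the record fields (`rxn`, `product`, `year`), the cutoff year, and the helper names `time_split`, `leaked_products`, and `drop_leaked` are all assumptions, and the leakage check compares exact product strings only, so it presumes the SMILES were canonicalized beforehand.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical record layout: {"rxn": str, "product": str, "year": int}
Reaction = Dict[str, Any]


def time_split(reactions: List[Reaction], cutoff_year: int) -> Tuple[List[Reaction], List[Reaction]]:
    """Train on reactions dated before cutoff_year, test on the rest."""
    train = [r for r in reactions if r["year"] < cutoff_year]
    test = [r for r in reactions if r["year"] >= cutoff_year]
    return train, test


def leaked_products(train: List[Reaction], test: List[Reaction]) -> Set[str]:
    """Products that occur in both splits (exact string match only)."""
    return {r["product"] for r in train} & {r["product"] for r in test}


def drop_leaked(train: List[Reaction], test: List[Reaction]) -> List[Reaction]:
    """Keep only test reactions whose product was not seen in training."""
    seen = {r["product"] for r in train}
    return [r for r in test if r["product"] not in seen]


# Toy records for illustration only; they are not taken from USPTO.
toy = [
    {"rxn": "CC(=O)O.OCC>>CC(=O)OCC", "product": "CC(=O)OCC", "year": 2005},
    {"rxn": "CC(=O)Cl.OCC>>CC(=O)OCC", "product": "CC(=O)OCC", "year": 2014},
    {"rxn": "CC(=O)O.OC>>CC(=O)OC", "product": "CC(=O)OC", "year": 2015},
]

train, test = time_split(toy, cutoff_year=2012)
overlap = leaked_products(train, test)
clean_test = drop_leaked(train, test)
```

Whether to drop overlapping products from the test set or merely report them is a design choice; the sketch shows both so the overlap can be inspected before anything is filtered out.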