ProteinGym
217 deep mutational scanning (DMS) substitution assays, plus indels and clinical variants; the de facto standard for benchmarking variant effect predictors (VEPs).
Composite
97.5
Experimental validation
Wet-lab confirmed
Stages
Target ID
Lead ID / ADMET
IND-enabling
Modalities
protein-general
Task types
variant-effect
Size
dms_assays: 217
mutations: 2,700,000
clinical_variants: 2,525
License
MIT
First release
2022
Last updated
2025-03
Official site
Leaderboard
Dataset
Code / GitHub
HuggingFace
Paper
ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design · Notin P, Kollasch A, Ritter D, et al. · 2023 · paper · doi:10.48550/arXiv.2305.06259 · 320 citations
Flags
none
Experts
Groups
Hosted by
Related benchmarks
Rubric (7-criterion)
rigor
5
coverage
5
maintenance
5
adoption
5
quality
5
accessibility
5
industry_relevance
4
Notes
Field standard. Clinical track enables fair ESM/EVE/AlphaMissense comparison.