ProteinGym

217 deep mutational scanning (DMS) substitution assays plus indel and clinical-variant tracks; the de facto standard benchmark for variant effect predictors (VEPs).

Composite
97.5
Experimental validation
Wet-lab confirmed
Stages
Target ID, Lead ID / ADMET, IND-enabling
Modalities
protein-general
Task types
variant-effect
Size
dms_assays: 217
mutations: 2,700,000
clinical_variants: 2,525
License
MIT
First release
2022
Last updated
2025-03
Official site
→ project page
Leaderboard
→ leaderboard
Dataset
→ dataset
Code / GitHub
→ repository
HuggingFace
→ HF
Paper
ProteinGym: Large-Scale Benchmarks for Protein Fitness Prediction and Design · Notin P, Kollasch A, Ritter D, et al. · 2023 · paper · doi:10.48550/arXiv.2305.06259 · 320 citations
Flags
none
Experts
Debora Marks, Pascal Notin, Yarin Gal
Groups
Marks Lab (Debora Marks), Oxford OATML
Hosted by
ProteinGym
Related benchmarks
FLIP

Rubric (7-criterion)

rigor
5
coverage
5
maintenance
5
adoption
5
quality
5
accessibility
5
industry_relevance
4

Notes

Field standard. The clinical-variant track enables a like-for-like comparison of ESM, EVE, and AlphaMissense.
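
For readers new to this kind of benchmark, the following is a minimal sketch of how one predictor might be scored against one DMS assay using Spearman rank correlation, the rank-based metric commonly reported for DMS benchmarks. The function names and the toy scores are hypothetical and are not taken from the ProteinGym codebase or data; the official pipeline may differ in details such as aggregation across assays.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: measured fitness and model scores for five variants.
dms_scores = [0.9, 0.1, 0.5, 0.7, 0.3]
model_scores = [-1.0, -6.0, -3.5, -2.0, -2.5]

rho = spearman(dms_scores, model_scores)
print(round(rho, 3))  # 0.9
```

Because only the ordering matters, a model's raw scores (for example log-likelihood ratios) do not need to be on the same scale as the assay measurements.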
