ClawBio Skill Correctness Bench

Third-party benchmark (Biostochastics LLC) of bio-analysis skills, scored on safety, correctness, and honesty. Covers 10 skills with 182 tests in total.

Composite
74.2
Experimental validation
Retrospective
Stages
Disease Modeling, Target ID, Clinical Development
Modalities
cross-modality
Task types
correctness-audit, safety-audit
Size
skills: 10
tests: 182
pass_rate_pct: 92.3
License
MIT
First release
2026-04
Last updated
2026-05-03
Official site
→ project page
Leaderboard
→ leaderboard
Dataset
→ dataset
Code / GitHub
→ repository
HuggingFace
→ HF
Paper
clawbio_bench README (v0.1.5) · Biostochastics LLC · 2026 · paper · doi:N/A — repo · 5 citations
Flags
none
Groups
ClawBio, Biostochastics LLC
Hosted by
ClawBio Benchmarks

Rubric (7-criterion)

rigor: 5
coverage: 2
maintenance: 5
adoption: 2
quality: 4
accessibility: 5
industry_relevance: 3

Notes

As an independent third-party benchmark, it structurally precludes self-reference. Coverage is narrow, but rigor is exemplary.
