ClawBio Benchmarks
Public scientific-correctness leaderboard for bio-analysis skills. clawbio_bench, an independent third-party benchmark authored by Biostochastics LLC, tests ClawBio skills along three dimensions: safety, correctness, and honesty. Failures are published as a public failure surface with remediation tasks.
Kind: meta-platform
Composite: 74.2
Benchmarks tracked: 10
Direct-linked: 0
As of: 2026-05-03
Host organisation: ClawBio (open source, MIT)
Primary contacts: ClawBio maintainers; Biostochastics LLC (bench author)
Founded: 2026-04
License model: MIT
Official site: GitHub
Count methodology: Scraped https://clawbio.ai/benchmarks.html on 2026-05-12; last bench run 2026-05-03 against ClawBio commit 7820473 using clawbio_bench v0.1.5. 10 skills audited: claw-metagenomics, equity-scorer, nutrigx-advisor, bio-orchestrator, pharmgx-reporter, fine-mapping, clinical-variant-reporter, cvr-acmg-correctness, gwas-prs, cvr-variant-identity. 168/182 tests passing (92.3%).
Rubric
rigor: 5
coverage: 2
maintenance: 5
adoption: 2
quality: 4
accessibility: 5
industry_relevance: 3
Breakdown
skills_audited: 10
tests_total: 182
tests_passing: 168
pass_rate_pct: 92.3
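The headline pass rate can be reproduced from the raw counts in the breakdown; a minimal sketch (the dict keys mirror the Breakdown field names, and the helper function name is illustrative, not part of clawbio_bench):

```python
# Recompute the pass rate from the Breakdown counts above.
breakdown = {
    "skills_audited": 10,
    "tests_total": 182,
    "tests_passing": 168,
}

def pass_rate_pct(passing: int, total: int) -> float:
    """Percentage of passing tests, rounded to one decimal place."""
    return round(100.0 * passing / total, 1)

rate = pass_rate_pct(breakdown["tests_passing"], breakdown["tests_total"])
print(rate)  # → 92.3
```

This matches the published pass_rate_pct of 92.3 (168 of 182 tests passing).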
Notes
Independent third-party bench maintained in a separate repo, so it is structurally NOT self-referential. Coverage is narrow (bio-analysis skills only), but rigor is exemplary: each skill is audited along three dimensions (safety, correctness, honesty). A model for how skill/agent correctness should be audited.