
mathlib

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Formal Theorem Proving | mathlib (val) | Pass@1: 62.6 | 9 |
| Proof Optimization (Length) | Mathlib | Improvement: 6.19 | 4 |
| Formal Theorem Proving | mathlib (test) | Pass@1: 63 | 3 |
| Proof Optimization (Declarativity) | Mathlib | Improvement: 4.63 | 2 |