Combibench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Formal Theorem Proving | Combibench | Solve Rate: 48 | 15 |
| Theorem Proving | CombiBench | Proof Length: 8 | 13 |
| Theorem Proving | CombiBench Combinatorics | Solved Problems: 27 | 13 |
| Auto-formalization | CombiBench | Pass@8: 97 | 13 |
| Statement generation | CombiBench N = 100 | CH@100: 1 | 11 |
| Theorem Proving | CombiBench | pass@32: 16 | 8 |
| Automated Theorem Proving | CombiBench Easy Mode | Solved Problems (Pass@32): 10 | 4 |
| Autoformalization and Proving | CombiBench (N=100) | Pass@64: 96 | 4 |
| Automated Theorem Proving | CombiBench Hard Mode | Total Solved (Pass@32): 10 | 3 |