Combibench

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Auto-formalization | CombiBench | Pass@8: 97 | 13
Statement generation | CombiBench N = 100 | CH@100: 1 | 11
Autoformalization and Proving | CombiBench (N=100) | Pass@64: 96 | 4
Formal Theorem Proving | Combibench | Solve Rate: 48 | 2