Theorem-proving on MathOlympiad-Bench
Best result: 46.7 Pass@32 (LongCat-Flash-Prover)
[Chart: Pass@32 by date; plotted point at Mar 22, 2026]
Updated 25d ago
Evaluation Results
Method                    Configuration               Date      Pass@32
LongCat-Flash-Prover      Mode=sketch-proof, Sea...   2026.03   46.7
LongCat-Flash-Prover      Mode=sketch-proof, Sea...   2026.03   42.5
LongCat-Flash-Prover      Evaluation Mode=sketch...   2026.03   35.8
LongCat-Flash-Prover      Evaluation Mode=whole-...   2026.03   27.5
Goedel-Prover-V2-32B      Model Category=Open-We...   2026.03   20.3
LongCat-Flash-Prover      Evaluation Mode=whole-...   2026.03   16.9
Goedel-Prover-V2-32B      Model Category=Open-We...   2026.03   16.7
Goedel-Prover-V2-32B      Budget (b)=32               2026.03   16.7
DeepSeek-V3.2             Model Category=Open-We...   2026.03   14.7
DeepSeek-Prover-V2-671B   Model Category=Open-We...   2026.03   13.9
Kimina-Prover-72B         Model Category=Open-We...   2026.03   13.1
DeepSeek-Prover-V2-7B     Model Category=Open-We...   2026.03   11.1
Goedel-Prover-V2-8B       Model Category=Open-We...   2026.03   11.1
Kimina-Prover-8B          Model Category=Open-We...   2026.03   8.6
Kimi-K2.5                 Model Category=Open-We...   2026.03   7.5
Gemini-3 Pro              Model Category=Close-W...   2026.03   3.9