Theorem Proving on PutnamBench (test)
[Chart: Accuracy and Solved Problems over time. Highlighted point: Hilbert, Accuracy 72, Oct 14, 2025.]
Updated 9d ago
Evaluation Results
Method                    Details                    Links    Accuracy  Solved Problems
Hilbert                   Compute=avg. pass@1840...  2025.10  72        462
Seed-Prover               Compute=medium, Open-s...  2025.10  51        329
Ax-Prover                 Compute=pass@1‡, Open-...  2025.10  14        92
Goedel-Prover-V2          Compute=pass@184, Open...  2025.10  13        86
DeepSeek-Prover-V2        Compute=pass@1024, Ope...  2025.10  7         47
DSP+                      Compute=pass@128, Open...  2025.10  4         23
Bourbaki                  Compute=pass@512, Open...  2025.10  2         14
Kimina-Prover-7B-Distill  Compute=pass@192, Open...  2025.10  2         10
Self-play Theorem Prover  Compute=pass@3200, Ope...  2025.10  1         8
Goedel-Prover-SFT         Compute=pass@512, Open...  2025.10  1         7
Gemini-2.5-Pro            Compute=pass@1, Open-s...  2025.10  0.5       3
GPT-4o                    Compute=pass@10, Open-...  2025.10  0.2       1
Claude-3.7-Sonnet         Compute=pass@1, Open-s...  2025.10  0         0