Formal Theorem Proving on PutnamBench September 2025
[Chart: Solved Problems Count. Top entry: HILBERT, 462 (Sep 26, 2025). Available metrics: Solved Problems Count, Solved Problems (%).]
Evaluation Results
Method                     Details                    Date     Solved Problems Count  Solved Problems (%)
HILBERT                    llm=Gemini 2.5 Pro, pr...  2025.09  462                    70
SeedProver                 -                          2025.09  331                    50.4
HILBERT                    llm=gpt-oss-120b, prov...  2025.09  88                     13.3
Goedel-Prover-V2-32B       self-correction=true,...   2025.09  86                     13.4
DeepSeek-Prover-V2 671B    pass@k=1024                2025.09  47                     7.1
Bourbaki                   pass@k=512                 2025.09  26                     4
DSP+                       pass@k=128                 2025.09  23                     3.6
Kimina-Prover-7B-Distill   pass@k=192                 2025.09  10                     1.5
Self-play Theorem Prover   pass@k=3200                2025.09  8                      1.2
Goedel-Prover-SFT          pass@k=512                 2025.09  7                      1.1
ABEL                       pass@k=596                 2025.09  7                      1.1