Formal Theorem Proving on Number Theory
[Chart: PutnamBench score. Highlighted point: Hilbert, 2.51, Apr 29, 2026. Series shown: PutnamBench, ProverBench.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | PutnamBench | ProverBench |
|---|---|---|---|---|
| Hilbert | Backbone=Gemini 2.5 Pr... | 2026.04 | 2.51 | 2.19 |
| Goedel-Prover-V2-8B | Type=Open-source LLM | 2026.04 | 2.1 | 1.72 |
| Hilbert | Backbone=Gemini 3.1 Pr... | 2026.04 | 1.87 | 1.52 |
| Goedel-Prover-V2-32B | Type=Open-source LLM | 2026.04 | 1.62 | 1.27 |
| Hilbert | Backbone=GPT-5.3-Codex... | 2026.04 | 1.2 | 1.07 |
| DreamProver | Backbone=Gemini 3.1 Pr... | 2026.04 | 1.15 | 0.64 |
| DreamProver | Backbone=Gemini 2.5 Pr... | 2026.04 | 1.13 | 0.54 |
| Gemini 2.5 Pro | Type=Proprietary LLM | 2026.04 | 0.96 | 0.61 |
| DreamProver | Backbone=GPT-5.3-Codex... | 2026.04 | 0.89 | 0.27 |
| Gemini 3.1 Pro | Type=Proprietary LLM | 2026.04 | 0.59 | 0.6 |
| DeepSeek-Prover-V2-7B | Type=Open-source LLM | 2026.04 | 0.58 | 0.5 |
| Claude 4.6 Opus | Type=Proprietary LLM | 2026.04 | 0.51 | 0.4 |
| GPT-5.3-Codex | Type=Proprietary LLM | 2026.04 | 0.18 | 0.13 |