Formal Theorem Proving on FormalML Hard
[Chart: Proof Length over time. Highlighted result: GPT-5.3-Codex, Proof Length 11.2, Apr 29, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Method Category | Date | Proof Length |
| GPT-5.3-Codex | Propri... | 2026.04 | 11.2 |
| DreamProver (GPT-5.3-Codex) | Lemma... | 2026.04 | 23.3 |
| DreamProver (Gemini 2.5 Pro) | Lemma... | 2026.04 | 27.5 |
| DreamProver (Gemini 3.1 Pro) | Lemma... | 2026.04 | 31.6 |
| Gemini 2.5 Pro | Propri... | 2026.04 | 32.8 |
| Gemini 3.1 Pro | Propri... | 2026.04 | 39.7 |