Automated Theorem Proving on FormalML-Hard (Machine Learning Theory) 1.0 (test)
[Chart: Output Tokens (k) over time. Highlighted entry: DreamProver, 0.4 (Apr 29, 2026)]
Updated 1mo ago
Evaluation Results

Method          System Category   Date      Output Tokens (k)
DreamProver     Lemma...          2026.04   0.4
DreamProver     Lemma...          2026.04   0.36
Gemini 2.5 Pro  Propri...         2026.04   0.29
Gemini 3.1 Pro  Propri...         2026.04   0.28
DreamProver     Lemma...          2026.04   0.23
GPT-5.3-Codex   Propri...         2026.04   0.1