
Proof writing on IMO-ProofBench

58.7 Avg@3 Grade Score

Gemini 3 Pro

Updated 3mo ago

Evaluation Results

Method	Date	Avg@3 Grade Score
Gemini 3 Pro	2026.04	58.7
DeepSeek-Math-V2	2026.04	57.9
QED-Nano (+ RSA test-time scaffold)	2026.04	56.9
GPT-OSS-120B	2026.04	43.1
Nomos-1	2026.04	40.3
QED-Nano	2026.04	40.0
QED-Nano (SFT initialization only)	2026.04	39.5
GPT-OSS-20B	2026.04	38.3
Qwen3-235B-A22B-Thinking-2507	2026.04	34.1
Qwen3-30B-A3B-Thinking-2507	2026.04	27.6
Qwen3-4B-Thinking-2507	2026.04	20.4
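
The page does not spell out how the Avg@3 Grade Score is computed. A minimal sketch of one plausible reading follows: each problem is attempted three times, each proof receives a grade, and the score is the mean grade expressed as a percentage of the maximum. The 0 to 7 grading scale, the function name `avg_at_k`, and the sample grades are all assumptions made for illustration and are not taken from the benchmark.

```python
from statistics import mean


def avg_at_k(grades_per_problem, max_grade=7.0):
    """Average grade over all attempts, as a percentage of max_grade.

    grades_per_problem: one list of attempt grades per problem
    (three attempts each for Avg@3). max_grade=7 is an assumed scale.
    """
    # Average the attempts for each problem, then normalize to 0-100.
    per_problem = [mean(attempts) / max_grade * 100 for attempts in grades_per_problem]
    # Average across problems to get the headline number.
    return mean(per_problem)


# Hypothetical grades for three problems, three attempts each.
grades = [[7, 7, 6], [0, 1, 0], [7, 3, 7]]
print(round(avg_at_k(grades), 1))
```

Averaging over several attempts, rather than keeping the best one, rewards a model for being consistently correct and reduces the effect of a single lucky or unlucky sample.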