Multimodal Autoformalization on ANALYTIC GEOMETRY Solid Geometry
[Chart: Compile Rate by method. Plotted point: GPT-5, Jan 6, 2026. Available metrics: Compile Rate, Semantic Correctness, Human Check Score.]
Evaluation Results
| Method         | Modality | Date    | Compile Rate | Semantic Correctness | Human Check Score |
|----------------|----------|---------|--------------|----------------------|-------------------|
| GPT-5          | Text     | 2026.01 | 1            | 1                    | -                 |
| Gemini-3-Pro   | Image    | 2026.01 | 0.6          | 0.4                  | 0.4               |
| Gemini-2.5-Pro | Image    | 2026.01 | 0.6          | 0.4                  | 0.6               |
| Gemini-3-Pro   | Text     | 2026.01 | 0.4          | 0.2                  | -                 |
| GPT-5          | Image    | 2026.01 | 0.2          | 0.2                  | 0                 |
| Qwen3-VL-235B  | Image    | 2026.01 | 0.2          | 0.2                  | -                 |
| Gemini-2.5-Pro | Text     | 2026.01 | 0            | 0                    | -                 |
| Qwen3-VL-235B  | Text     | 2026.01 | 0            | 0                    | -                 |
| Qwen2.5-VL-72B | Image    | 2026.01 | 0            | 0                    | 0                 |
| Qwen2.5-VL-72B | Text     | 2026.01 | 0            | 0                    | -                 |