Embodied reasoning on ERQA (test)
[Chart: Accuracy by date. Top result: R4, 70.25 accuracy. Date shown: Dec 17, 2025. Updated 4d ago.]
Evaluation Results

Method                        Details                     Date      Accuracy
R4                            Backbone=Gemma3-4B-IT,...   2025.12   70.25
GPT-5                         -                           2025.12   65.7
o3                            -                           2025.12   64
Qwen3 VL 235B A22B Thinking   Model Size=235B A22B        2025.12   52.5
Qwen3 VL 32B Thinking         Model Size=32B              2025.12   52.3
Gemini 2.0 Pro Experimental   -                           2025.12   48.3
GPT-4o                        -                           2025.12   47
Gemini 2.0 Flash              -                           2025.12   46.3
Gemini 1.5 Flash              -                           2025.12   42.3
Gemini 1.5 Pro                -                           2025.12   41.8
GPT-4o-mini                   -                           2025.12   37.3
Claude 3.5 Sonnet             -                           2025.12   35.5