Real-world Visual Understanding on RealWorldQA (test)
[Chart: Final Performance by date. Highlighted point: Full RT, Final Performance 68.9. X-axis: Apr 6, 2026. Y-axis ticks: 60.164 to 66.968. Legend: Final Performance, MV 2B(S)/8B(S) Score, MV 2B(R)/8B(S) Score, Runtime (RT).]
Updated 11d ago
Evaluation Results
| Method | Configuration | Date | Final Performance | MV 2B(S)/8B(S) Score | MV 2B(R)/8B(S) Score | Runtime (RT) |
|---|---|---|---|---|---|---|
| Full RT | Backbone=Qwen3-VL-8B+2B-R | 2026.04 | 68.9 | - | - | - |
| MAI | Backbone=Qwen3-VL-2B/8B | 2026.04 | 68.4 | 509 | 105 | 151 |
| 8B(S) Baseline | Backbone=Qwen3-VL-8B,... | 2026.04 | 67.6 | - | - | - |
| 2B(S) Baseline | Backbone=Qwen3-VL-2B,... | 2026.04 | 60.7 | - | - | - |
| 2B(R) Baseline | Backbone=Qwen3-VL-2B,... | 2026.04 | 60.5 | - | - | - |