Multidisciplinary Reasoning on MMMU Pro 4
Top result: 62.03 Accuracy@1, Qwen3-VL-8B-Instruct (Teacher)
[Chart: Accuracy@1 and Accuracy@16; data point dated May 30, 2026]
Updated 1d ago
Evaluation Results
| Method | Details | Date | Accuracy@1 | Accuracy@16 |
|---|---|---|---|---|
| Qwen3-VL-8B-Instruct (Teacher) | Model Size=8B, Decodin... | 2026.05 | 62.03 | - |
| VGS On-Policy Distillation | Student Model=Qwen3-VL... | 2026.05 | 56.86 | 56.86 |
| Standard On-Policy Distillation | Student Model=Qwen3-VL... | 2026.05 | 55.79 | 56.43 |
| Qwen3-VL-4B-Instruct (Initial Student) | Model Size=4B, Decodin... | 2026.05 | 48.93 | - |
| VGS On-Policy Distillation | Student Model=Qwen3-VL... | 2026.05 | 48.07 | 48.34 |
| Standard On-Policy Distillation | Student Model=Qwen3-VL... | 2026.05 | 45.83 | 47.33 |
| Qwen3-VL-2B-Instruct (Initial Student) | Model Size=2B, Decodin... | 2026.05 | 34.51 | - |
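
As a rough illustration of how the two metric columns compare, the sketch below subtracts Accuracy@1 from Accuracy@16 for each row that reports both. The numbers are copied from the results above; the names `ROWS` and `metric_gaps` are hypothetical, and the page does not define how Accuracy@16 is computed, so the code only does the column arithmetic and assumes nothing about the metric itself.

```python
# Rows copied from the results table: (method, Accuracy@1, Accuracy@16).
# None stands for a "-" entry (metric not reported).
ROWS = [
    ("Qwen3-VL-8B-Instruct (Teacher)", 62.03, None),
    ("VGS On-Policy Distillation", 56.86, 56.86),
    ("Standard On-Policy Distillation", 55.79, 56.43),
    ("Qwen3-VL-4B-Instruct (Initial Student)", 48.93, None),
    ("VGS On-Policy Distillation", 48.07, 48.34),
    ("Standard On-Policy Distillation", 45.83, 47.33),
    ("Qwen3-VL-2B-Instruct (Initial Student)", 34.51, None),
]


def metric_gaps(rows):
    """Return (method, acc1, acc16 - acc1) for rows that report both metrics."""
    return [
        (method, acc1, round(acc16 - acc1, 2))
        for method, acc1, acc16 in rows
        if acc16 is not None
    ]


for method, acc1, gap in metric_gaps(ROWS):
    # Two rows share each method name, so Accuracy@1 is printed to tell them apart.
    print(f"{method} (Accuracy@1={acc1}): {gap:+.2f}")
```

Rows with the same method name are kept separate because their student-model details are truncated on the page and cannot be told apart by name alone.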