Relational Reasoning on VSR
Best result: 85.7 accuracy, QWEN2.5-VL-7B + PGT (May 22, 2026)
[Chart: Accuracy by date]
Updated 9d ago
Evaluation Results
| Method | Configuration | Date | Accuracy (%) |
| --- | --- | --- | --- |
| QWEN2.5-VL-7B + PGT | Backbone=QWEN2.5-VL-7B... | 2026.05 | 85.7 |
| QWEN2.5-VL-7B + SPECIALIZED MIX | Backbone=QWEN2.5-VL-7B... | 2026.05 | 85.5 |
| IMAGE JIGSAW | Method=IMAGE JIGSAW | 2026.05 | 85.4 |
| INTERNVL3-8B + PGT | Backbone=INTERNVL3-8B,... | 2026.05 | 85.3 |
| INTERNVL3-8B | Backbone=INTERNVL3-8B,... | 2026.05 | 85.2 |
| QWEN2.5-VL-3B + PGT | Backbone=QWEN2.5-VL-3B... | 2026.05 | 84 |
| QWEN2.5-VL-7B | Backbone=QWEN2.5-VL-7B... | 2026.05 | 83.8 |
| THINKLITE-VL | Method=THINKLITE-VL | 2026.05 | 83.3 |
| QWEN2.5-VL-3B + SPECIALIZED MIX | Backbone=QWEN2.5-VL-3B... | 2026.05 | 82.8 |
| QWEN2.5-VL-3B | Backbone=QWEN2.5-VL-3B... | 2026.05 | 80.4 |
| VIGORL-3B | Method=VIGORL-3B | 2026.05 | 74.1 |
| LLAVA-NEXT-LLAMA3-8B | Backbone=LLAVA-NEXT-LL... | 2026.05 | 71.8 |
| LLAVA-NEXT-LLAMA3-8B + PGT | Backbone=LLAVA-NEXT-LL... | 2026.05 | 71.8 |
| LLAVA-NEXT-7B + PGT | Backbone=LLAVA-NEXT-7B... | 2026.05 | 70.9 |
| LLAVA-NEXT-7B | Backbone=LLAVA-NEXT-7B... | 2026.05 | 64.7 |
| SPATIAL-LADDER-3B | Method=SPATIAL-LADDER-3B | 2026.05 | 60.4 |