Multi-image visual reasoning on BLINK
[Chart: Accuracy over time, Sep 30, 2025 to Apr 14, 2026. Top result: VAPO, 60.1 accuracy.]
Updated 4d ago
Evaluation Results
Method          Details                    Date     Accuracy
VAPO            -                          2025.09  60.1
GRPO            -                          2025.09  57
Base model      -                          2025.09  55.3
VIF             LLM=Vicuna-7B, Resolut...  2026.04  40.5
LLaVA-v1.5      LLM=Vicuna-7B, Resolut...  2026.04  39.7
IDEFICS-9B      LLM=LLaMA 2-7B             2026.04  38.3
Mantis-8B-Fuyu  LLM=Fuyu-8B, Resolutio...  2026.04  38.2
Qwen-VL-Chat    LLM=Qwen-7B, Resolutio...  2026.04  28.2
Qwen-VL         LLM=Qwen-7B, Resolutio...  2026.04  27.9