Multi-image Visual Question Answering on MMXU (test)
[Chart: Worsen Rate, Improved Rate, No Change Rate, and Overall Score by method (Feb 17, 2025)]
Updated 3mo ago
Evaluation Results
| Method | Size | Date | Worsen Rate | Improved Rate | No Change Rate | Overall Score |
|---|---|---|---|---|---|---|
| Med-Flamingo | 7B | 2025.02 | 28.9 | 30.4 | 25.1 | 28.1 |
| Llava-med | 7B | 2025.02 | 29.3 | 33.4 | 32.3 | 31.7 |
| HuatuoGPT-Vision | 7B | 2025.02 | 62.9 | 50.2 | 35.4 | 49.5 |
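The page does not state how the Overall Score is derived. As a rough sanity check, the sketch below assumes it is the unweighted mean of the Worsen, Improved, and No Change scores listed above. That formula is an assumption, not something the leaderboard confirms, and the names `results`, `mean_of_rates`, and `matches_reported` are hypothetical.

```python
# Scores copied from the results listing: (worsen, improved, no_change, overall).
results = {
    "Med-Flamingo": (28.9, 30.4, 25.1, 28.1),
    "Llava-med": (29.3, 33.4, 32.3, 31.7),
    "HuatuoGPT-Vision": (62.9, 50.2, 35.4, 49.5),
}


def mean_of_rates(worsen, improved, no_change):
    """Unweighted mean of the three per-category scores (assumed formula)."""
    return (worsen + improved + no_change) / 3


def matches_reported(scores, tolerance=0.051):
    """True if the assumed mean agrees with the reported overall score.

    The tolerance covers rounding of the reported value to one decimal place.
    """
    worsen, improved, no_change, overall = scores
    return abs(mean_of_rates(worsen, improved, no_change) - overall) <= tolerance


for method, scores in results.items():
    computed = mean_of_rates(*scores[:3])
    print(f"{method}: computed {computed:.2f}, reported {scores[3]}, "
          f"consistent={matches_reported(scores)}")
```

For the three rows here, the assumed mean agrees with the reported Overall Score to one decimal place, which is consistent with (but does not prove) a simple average.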