Multi-image reasoning on QBench2 (val)

Top accuracy: 79.3 (MM1.5-30B)

Updated 5mo ago

Evaluation Results

Method	Date	Accuracy
MM1.5-30B	2024.09	79.3
LLaVA-NeXT-Interleave-14B	2024.09	76.7
GPT-4V	2024.09	76.5
Mantis-Idefics2-8B	2024.09	75.2
BLIP-3-4B	2024.09	75.1
LLaVA-NeXT-Interleave-7B	2024.09	74.2
MM1.5-3B-MoE	2024.09	73.8
MM1.5-3B	2024.09	73.2
MM1.5-7B	2024.09	73.2
MM1.5-1B-MoE	2024.09	70.9
MM1.5-1B	2024.09	66.4
Idefics2-8B	2024.09	57
Phi-3-Vision-4B	2024.09	56.8
LLaVA-NeXT-Interleave-0.5B	2024.09	52
Emu2-Chat-37B	2024.09	50.1
LLaVA-v1.5-7B	2024.09	49.3
LLaVAOneVision-0.5B	2024.09	48.8
MM1-7B	2024.09	43.6
MM1-1B	2024.09	43.4
MM1-30B	2024.09	42.8
MM1-3B	2024.09	41.4