
General Multi-image Reasoning and Generalization on LLaVA-Interleave Bench Out-domain

Average Score: 57.8 (GPT-4V)

Updated 5mo ago

Evaluation Results

Method	Links	Average Score
GPT-4V 2024.07		57.8	60.3	66.9	62.7	51.1	47.9	-
LLaVA-NeXT-Interleave 2024.07		44.3	33.4	32.7	66.4	52.1	37.1	-
LLaVA-NeXT-Interleave 2024.07		42.8	32.8	31.6	62.7	52.6	34.5	-
Mantis 2024.07		39.3	27.2	29.3	59.5	46.4	34.1	-
VPG-C 2024.07		34.5	24.3	23.1	52.4	43.1	29.4	-
LLaVA-NeXT-Interleave 2024.07		33.1	13.3	12.2	45.6	39.2	28.6	-
LLaVA-NeXT-Image 2024.07		29.4	13.5	12.2	46.1	41.8	33.5	-
LLaVA-OneVision-0.5B 2024.08		-	-	-	-	-	-	33.3
LLaVA-OneVision-7B 2024.08		-	-	-	-	-	-	64.2
LLaVA-OneVision-72B 2024.08		-	-	-	-	-	-	79.9
GPT-4V 2024.08		-	-	-	-	-	-	60.3
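
The first numeric column in the table appears to be the mean of the five columns that follow it. The sketch below checks that reading against the rows that report an average. The column interpretation is an assumption, since the page gives no headers for the sub-scores, and the names `rows` and `mismatches` are illustrative only.

```python
# Rows copied from the table above: (method, reported average, five sub-scores).
# ASSUMPTION: the first numeric column is the mean of the next five columns.
rows = [
    ("GPT-4V", 57.8, [60.3, 66.9, 62.7, 51.1, 47.9]),
    ("LLaVA-NeXT-Interleave", 44.3, [33.4, 32.7, 66.4, 52.1, 37.1]),
    ("LLaVA-NeXT-Interleave", 42.8, [32.8, 31.6, 62.7, 52.6, 34.5]),
    ("Mantis", 39.3, [27.2, 29.3, 59.5, 46.4, 34.1]),
    ("VPG-C", 34.5, [24.3, 23.1, 52.4, 43.1, 29.4]),
    ("LLaVA-NeXT-Interleave", 33.1, [13.3, 12.2, 45.6, 39.2, 28.6]),
    ("LLaVA-NeXT-Image", 29.4, [13.5, 12.2, 46.1, 41.8, 33.5]),
]


def mismatches(table, tolerance=0.05):
    """Return rows whose reported average differs from the recomputed mean.

    The tolerance of 0.05 allows for the scores being rounded to one decimal.
    """
    result = []
    for method, reported, scores in table:
        mean = sum(scores) / len(scores)
        if abs(mean - reported) > tolerance:
            result.append((method, reported, round(mean, 1)))
    return result


if __name__ == "__main__":
    for method, reported, recomputed in mismatches(rows):
        print(f"{method}: reported {reported}, recomputed {recomputed}")
```

Under this assumption, six of the seven rows agree with their reported average to one decimal place. The third LLaVA-NeXT-Interleave row (reported 33.1) is the exception; the page does not say whether its average was computed differently or whether one of its sub-scores was transcribed incorrectly.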