
General Multi-image Reasoning and Generalization on LLaVA-Interleave Bench Out-domain

Average Score: 57.8 (GPT-4V)

[Chart: Average Score, axis ticks 28.264 to 51.268; data point dated Jul 10, 2024]
Updated 4d ago

Evaluation Results

Date      Average Score   Remaining scores
2024.07   57.8            60.3   66.9   62.7   51.1   47.9   -
2024.07   44.3            33.4   32.7   66.4   52.1   37.1   -
2024.07   42.8            32.8   31.6   62.7   52.6   34.5   -
2024.07   39.3            27.2   29.3   59.5   46.4   34.1   -
2024.07   34.5            24.3   23.1   52.4   43.1   29.4   -
2024.07   33.1            13.3   12.2   45.6   39.2   28.6   -
2024.07   29.4            13.5   12.2   46.1   41.8   33.5   -
2024.08   -               -      -      -      -      -      33.3
2024.08   -               -      -      -      -      -      64.2
2024.08   -               -      -      -      -      -      79.9
2024.08   -               -      -      -      -      -      60.3
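
The 2024.07 rows can be sanity-checked with a little arithmetic. The sketch below is illustrative only: it assumes every score is printed with exactly one decimal place and that "-" marks a missing value, and it tests whether the first value in a row equals the mean of the next five to within rounding. That relationship is an observation from the numbers, not something the page states, and the names `ROWS`, `split_scores`, and `average_matches` are hypothetical.

```python
import re

# Score strings of the 2024.07 rows, copied as they appear on the page.
ROWS = [
    "57.860.366.962.751.147.9-",
    "44.333.432.766.452.137.1-",
    "42.832.831.662.752.634.5-",
    "39.327.229.359.546.434.1-",
    "34.524.323.152.443.129.4-",
    "33.113.312.245.639.228.6-",
    "29.413.512.246.141.833.5-",
]


def split_scores(fused):
    """Split a run of one-decimal scores; '-' becomes None (missing)."""
    tokens = re.findall(r"\d+\.\d|-", fused)
    return [None if t == "-" else float(t) for t in tokens]


def average_matches(scores, tol=0.05):
    """True if the first score is the mean of the next five, up to rounding."""
    reported, parts = scores[0], scores[1:6]
    return abs(reported - sum(parts) / len(parts)) <= tol + 1e-9


if __name__ == "__main__":
    for fused in ROWS:
        scores = split_scores(fused)
        print(scores[0], average_matches(scores))
```

Under these assumptions, six of the seven rows pass. The row that begins with 33.1 does not, since its next five values average 27.78, so that row is worth checking against the original source before relying on it.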