SOTA Multimodal Conversation benchmarks and papers with code

Benchmarks

Dataset Name	SOTA Method	Metric	Score	Count	Last Updated
LLaVA-Bench Wild		Score	102	78	1mo ago
LLaVA Bench	GPT-4V (2023.11.06)	LLaVA Bench Score	93.1	46	2mo ago
PCogAlignBench (LS2)	GRPO (Latent Action)	LLM Judge Score	0.852	20	5mo ago
PCogAlignBench LS1	DAPO (Latent Action)	LLM Judge Score	0.903	20	5mo ago
MMRole OOD	Dr.GRPO (Latent Action)	LLM-as-a-Judge Score	91.6	20	5mo ago
MMRole (ID)	Dr.GRPO (Latent Action)	LLM-as-a-Judge Score	95.3	20	5mo ago
Multimodality Chatbot Arena	LLaMA-Adapter v2	Elo Rating	1,023	8	5mo ago
LLaVA-Bench In-the-Wild v1 (test)	ShareGPT4V-7B	General Score	0.726	7	5mo ago
