| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Conversation | MMRole OOD | LLM-as-a-Judge Score: 91.6 | 20 |
| Multimodal Conversation | MMRole (ID) | LLM-as-a-Judge Score: 95.3 | 20 |
| Multimodal Role-playing | MMRole (out-of-domain) | IA: 1.164 | 17 |
| Multi-modal Role-playing | MMRole in-domain | IA: 1.199 | 17 |