| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| VQA | RoleScape-20 | VQA Accuracy: 84 | 3 |
| Knowledge QA | RoleScape-20 | Knowledge QA Accuracy: 77 | 3 |
| T2I | RoleScape-20 | CLIP Similarity (I): 0.88 | 3 |
| Text-based Role Play | RoleScape-20 | Memorization: 5.45 | 3 |
| Multimodal Role-Play (T2T2I) | RoleScape-20 | CLIP-I Score: 0.86 | 2 |