| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Social Dialogue | SOTOPIA Interaction with GPT-4o | Goal Score: 8.21 | 28 |
| Social Dialogue | SOTOPIA Self-Chat | GOAL: 8.56 | 28 |
| Social Dialogue | SOTOPIA Overall (AVG) | AVG Score: 5.63 | 11 |
| Social Dialogue | SOTOPIA Interaction with GPT-4o-mini | GOAL Score: 7.53 | 11 |
| Social Interaction | SOTOPIA all social scenarios | Goal Score: 7.62 | 5 |
| Social Intelligence Assessment | SOTOPIA hard episodes (test) | SOC Score: -0.02 | 4 |