| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Machine Reading Comprehension | DREAM (test) | Accuracy: 91.8 | 23 |
| Deepfake video photorealism assessment | DREAM Overall Average | PLCC (Overall Average): 0.976 | 22 |
| Deepfake video photorealism assessment | DREAM (test-3) | PLCC: 0.975 | 22 |
| Deepfake video photorealism assessment | DREAM (test-2) | PLCC: 0.976 | 22 |
| Deepfake video photorealism assessment | DREAM (Test-1) | PLCC: 97.7 | 22 |
| Multiple Choice Question Answering | DREAM | Accuracy: 98.77 | 22 |
| Machine Reading Comprehension | DREAM (dev) | Accuracy: 90 | 21 |
| Dialogue-based Multiple-choice Question Answering | DREAM (test) | Accuracy: 91.8 | 21 |
| Robot Pose Estimation | DREAM-real Panda 3CAM-AK | AUC: 90.2 | 19 |
| Video Captioning / Summarization | Dream 1k | Rouge-L: 20.8 | 15 |
| Dialogue Comprehension | DREAM | Accuracy: 69.2 | 15 |
| Safety evaluation against dynamic adversarial chains | DREAM (test) | Overall Defense Score: 67.3 | 12 |
| Robot Pose Estimation | DREAM-real Panda ORB | AUC: 87.6 | 12 |
| Robot Pose Estimation | DREAM-real Panda 3CAM-RS | AUC: 91.9 | 12 |
| Fine-grained captioning | Dream1k | F1 Score: 29.5 | 11 |
| Video Description | DREAM-1K Overall 1.0 (test) | F1 Score: 40.1 | 11 |
| Video Description | DREAM-1K Stock 1.0 (test) | F1 Score: 44 | 11 |
| Video Description | DREAM-1K Shorts 1.0 (test) | F1 Score: 40.9 | 11 |
| Video Description | DREAM-1K YouTube 1.0 (test) | F1 Score: 34.5 | 11 |
| Video Description | DREAM-1K Animation 1.0 (test) | F1: 37.1 | 11 |
| Video Captioning | Dream-1K | Precision: 36 | 10 |
| Dialogue-based Multiple-choice Question Answering | DREAM (dev) | Accuracy: 89.9 | 10 |
| Quality-Penalized Efficiency Evaluation | Dream Base | QPS (gamma=4): 2.03 | 9 |
| Question Answering | DREAM | Accuracy: 69.51 | 9 |
| Robot Pose Estimation | DREAM Baxter DR | AUC: 75.5 | 8 |