| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|---|
| Video Matting | Real-world benchmark | MAD 0.19 | 8 |
| Failure Reasoning and Correction | Real-World Benchmark (test) | ROUGE-L 62.1 | 7 |
| Unseen Robot Adaptation | Real-world Benchmark | Motion Consistency 8.47 | 5 |
| Single Image Super-Resolution | Real-world benchmark (test) | NIQE 10.01 | 4 |
| Robot Manipulation | Real-World Benchmark 10 Tasks | Sequential Success Rate 75 | 3 |
| Failure Recovery | Real-World Benchmark | Recovery Rate 46 | 2 |