| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Long-horizon robot manipulation | Calvin ABCD→D | Task 1 Completion Rate | 99.4 | 96 |
| Long-horizon task completion | Calvin ABC->D | Success Rate (1) | 96.8 | 67 |
| Robot Manipulation | CALVIN (ABC->D) | Average Successful Length | 4.75 | 36 |
| Instruction-following robotic manipulation | CALVIN ABC→D (unseen environment D) | Success Rate (Length 1) | 98.5 | 29 |
| Robotic Manipulation | CALVIN ABCD->D | Success Rate (1 Inst) | 99.7 | 26 |
| Robot Manipulation | CALVIN ABC->D 1.0 | Success Rate (1 Inst) | 96.8 | 18 |
| Robotic Manipulation | Calvin ABC-D | Task-1 Score | 100 | 16 |
| Long-horizon robotic manipulation | CALVIN ABC→D Zero-shot | Task 1 Success Rate | 98.8 | 16 |
| Long-horizon robot manipulation | CALVIN | Task Completion Rate (1) | 96.3 | 15 |
| Long-horizon task completion | CALVIN | Success Rate (1 Task) | 93.8 | 15 |
| Long-Horizon Multi-Task Language Control | CALVIN ABC→D (test) | Seq Success (1) | 96 | 13 |
| Robotic Manipulation | CALVIN D→D | Success Rate (Length 1) | 93.7 | 12 |
| Long-horizon task success | CALVIN D→D long-horizon | Success Rate (LH-1) | 99.5 | 11 |
| Robot manipulation | CALVIN 10% ABCD → D | Success Rate (L=1) | 84.1 | 11 |
| Failure Detection | DSMF-CALVIN (test) | Accuracy | 90.64 | 10 |
| Language-conditioned visuomotor control | CALVIN ABC→D (Zero-shot) | Completion Rate (Seq 1) | 96 | 8 |
| Track prediction | CALVIN ABC → D (test) | Success Rate (δ < 4) | 43.7 | 7 |
| Robot Manipulation | CALVIN D->D | Average Successful Length | 2.92 | 6 |
| Video Generation | Calvin (val) | PSNR | 19.95 | 5 |
| Robotic Manipulation | CALVIN 10% of Env D | No-RGB Success Rate | 65.2 | 4 |
| Multitask Imitation Learning | CALVIN Enriched instructions (D) | Success Rate (Task 1) | 73.7 | 4 |
| Long-horizon robot manipulation | CALVIN unseen lang | Task Completion Rate (1 Task) | 76.4 | 4 |
| Long-horizon robot manipulation | CALVIN 10% data | Task 1 Completion Rate | 77.8 | 4 |
| Multi-task Robotic Manipulation | CALVIN (test) | Success Rate (1 task) | 80.3 | 4 |
| Manipulation | CALVIN ABC -> D | Success Rate | 92.2 | 3 |