| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Procedure Planning | CrossTask | Success Rate (SR) 40.45 | 43 |
| Goal-conditioned visual planning | CrossTask T=4 88 (test) | SR 37.04 | 40 |
| Action Step Localization | CrossTask (test) | Recall 52.5 | 32 |
| Action step localization | CrossTask | Average Recall 47.3 | 28 |
| Goal-conditioned visual planning | CrossTask T=3 88 | Success Rate (SR) 47.47 | 27 |
| Procedure Planning | CrossTask T=3 (test) | SR 41.14 | 27 |
| Visual Planning | CrossTask | Success Rate (SR) 38.45 | 22 |
| Online Action Detection | CrossTask | P-F1 34.5 | 20 |
| Keystep recognition | CrossTask (test) | Accuracy 28.9 | 18 |
| Keystep recognition | CrossTask | Accuracy 64.5 | 17 |
| Procedure Planning | CrossTask T=5 | Success Rate 14.2 | 15 |
| Goal-conditioned visual planning | CrossTask T=3 88 (test) | Success Rate (SR) 51.71 | 13 |
| Procedure Learning | CrossTask | Precision 60.9 | 13 |
| Consistent Video Retrieval | CrossTask (test) | Accuracy 0.6436 | 13 |
| Keystep forecasting | CrossTask | Accuracy 30.2 | 12 |
| Task recognition | CrossTask | Accuracy 97.1 | 12 |
| Procedure Planning | CrossTask short horizon T=3 | SR 37.96 | 11 |
| Weakly-supervised Action Segmentation | CrossTask | MoF 54 | 11 |
| Procedure Planning | CrossTask short horizon T=4 | SR 22.56 | 10 |
| Procedure Planning | CrossTask long horizons T=6 | Success Rate (SR) 9.27 | 10 |
| Action Segmentation | CrossTask | F1 Score 61.4 | 9 |
| Temporal Action Localization | CrossTask | Recall 41.4 | 9 |
| Temporal Action Localization | CrossTask (test) | Recall 0.414 | 9 |
| Procedure Planning | CrossTask T=4 (test) | SR 16.41 | 8 |
| Step localization | CrossTask | Recall 49.7 | 8 |