| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Next-Step-Prediction Style Planning | Cosmos Reason | Performance 66.82 | 16 |
| Planning | Cosmos | Score 66.36 | 10 |
| Log-likelihood Estimation | COSMOS 2020 (test) | Mean 0 | 9 |
| Embodied Reasoning | Cosmos R1 | Score 81 | 9 |
| Multi-modal Forgery Detection | COSMOS | ACC 53.78 | 5 |
| Cross-modal consistency verification | COSMOS | Accuracy 56.85 | 5 |