| Dataset Name | SOTA Method | Metric | Value | Trend | | |
|---|---|---|---|---|---|---|
| AgentBench | DPO – ADARUBRIC-DA | Success Rate | 34.1 | 8 | 25d ago | |
| Mini-ARC | | Success Rate | 52.03 | 6 | 1mo ago | |
| LIBERO-Plus zero-shot | A1-FM | Spatial Success Rate (Zero-Shot) | 86.6 | 5 | 11d ago | |
| KUKA Stacking 100 samples | Hierarchical GPMP | Success Rate (KUKA Stacking 100) | 70 | 4 | 1mo ago | |
| Maze2D 100 samples | Hierarchical GPMP | Success Rate | 81 | 4 | 1mo ago | |
| ACRE | | Success Rate | 93 | 4 | 1mo ago | |
| Bongard-LOGO | Two-stage pipeline (o1 + GPT-4o) | Success Rate | 80 | 4 | 1mo ago | |
| LIBERO-CF | X-VLA | CF Spatial Faithful | 8.9 | 3 | 1mo ago | |