| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Planning | BlocksWorld | Success Rate: 100 | 20 |
| Agent Task | BlocksWorld | Success Rate: 100 | 17 |
| Agent Behavior Adaptation | BlocksWorld (BW) (test) | Loop Ratio: 51 | 17 |
| Generalized Planning | Blocksworld | Scale: 55 | 12 |
| Planning | Blocks (Blocksworld) | Accuracy: 100 | 12 |
| Task and Motion Planning | Blocksworld n=6 | Success Rate: 80 | 4 |
| Task and Motion Planning | Blocksworld n=5 | Success Rate (SR): 100 | 4 |
| Task and Motion Planning | Blocksworld n=4 | Success Rate (%): 90 | 4 |
| Task and Motion Planning | Blocksworld (n=3) | Success Rate: 100 | 4 |
| Planning Efficiency | Blocksworld Planning | Ntokens: 589.3 | 4 |
| Planning | blocksworld-8b ML | Accuracy: 100 | 3 |
| Next-token prediction | blocksworld-8b (test) | Accuracy: 99.8 | 3 |
| Next-token prediction | blocksworld 8b (train) | Accuracy: 100 | 3 |