| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Planning | Blocksworld (test) | Accuracy | 97 | 21 |
| Planning | Blocksworld | Blocksworld Accuracy | 97 | 21 |
| Planning | BlocksWorld | Success Rate | 100 | 20 |
| Agent Task | BlocksWorld | Success Rate | 100 | 17 |
| Agent Behavior Adaptation | BlocksWorld (BW) (test) | Loop Ratio | 51 | 17 |
| Generalized Planning | Blocksworld | Scale | 55 | 12 |
| Planning | Blocks (Blocksworld) | Accuracy | 100 | 12 |
| Planning | Blocksworld unseen problems | Completion Rate | 100 | 11 |
| Planning | Blocksworld known optimal problems | Optimal Rate | 1 | 11 |
| Spatial Reasoning | Blocksworld 5-7 | Completion Rate | 30.5 | 10 |
| Planning | Blocksworld | Completion Rate | 100 | 9 |
| Planning Coverage | Blocksworld 30 tasks Autoscale (test) | Coverage | 16 | 6 |
| HTN Planning | Blocksworld GTOHP | Coverage | 30 | 6 |
| Planning | Blocksworld (test) | Average Solving Time (s) | 0.39 | 5 |
| Task and Motion Planning | Blocksworld n=6 | Success Rate | 80 | 4 |
| Task and Motion Planning | Blocksworld n=5 | Success Rate (SR) | 100 | 4 |
| Task and Motion Planning | Blocksworld n=4 | Success Rate (%) | 90 | 4 |
| Task and Motion Planning | Blocksworld (n=3) | Success Rate | 100 | 4 |
| Planning Efficiency | Blocksworld Planning | Ntokens | 589.3 | 4 |
| Heuristic Planning | blocksworld p23 | Expansion Rate (states/sec) | 619,260 | 3 |
| Planning | blocksworld-8b ML | Accuracy | 100 | 3 |
| Next-token prediction | blocksworld-8b (test) | Accuracy | 99.8 | 3 |
| Next-token prediction | blocksworld 8b (train) | Accuracy | 100 | 3 |
| Planning | Blocksworld 1000 samples (test) | Plan Length | 40.74 | 2 |
| Planning | Blocksworld 26-100 blocks (test) | Completion Rate (%) | 97.5 | 2 |