| Dataset Name | SOTA Method | Metric | Trend | Updated |
|---|---|---|---|---|
| AsyncHow | DeepSeek-V4-Flash | Makespan Accuracy 98.44 | 15 | 1d ago |
| NL-AAVE (test) | | Accuracy 72.4 | 7 | 3mo ago |
| NL (test) | Graph (40 steps) + NL (40 steps) | Accuracy 78.2 | 7 | 3mo ago |
| NL (train) | Graph (40 steps) + NL (40 steps) | Accuracy 87.3 | 5 | 3mo ago |
| Robotouille | PDDL2.1 Formalizer | Makespan Accuracy 17.5 | 3 | 1d ago |