| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Length-Constrained Text Generation | HANNA | Win Rate: 23 | 10 |
| Text Generation | HANNA (test) | LCTG Error Rate: 2.58 | 10 |
| Interactive Navigation | HANNA (UNSEEN-ALL) | Success Rate (SR): 10,000 | 7 |
| Interactive Navigation | HANNA (SEEN-ENV) | Success Rate: 10,000 | 7 |
| Story-level evaluation | HANNA | Coherence (RP): 0.678 | 6 |