| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Long-context language understanding suite | ZeroSCROLLS | GovReport Score 33.5 | 24 |
| Long-context language understanding | SCROLLS (test) | Average Score 47.4 | 18 |
| Question Answering | SCROLLS NarraQA | Accuracy 12.29 | 10 |
| Question Answering | SCROLLS QAsper | Accuracy 14.8 | 10 |
| Summarization | SCROLLS | ROUGE-1 14.83 | 8 |
| Long-context language understanding | SCROLLS (dev) | GovRep ROUGE-1 57.4 | 7 |
| Long-context Open-book Question Answering and Summarization | SCROLLS (val) | NaQA F1 Score 23.9 | 6 |