| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Long-context language modeling | ZeroSCROLLS (test) | GovReport Score 35.8 | 24 |
| Long-context understanding | ZeroSCROLLS (val) | QuALITY EM 95.2 | 6 |
| Question Answering | ZeroSCROLLS SQuALITY (test) | ROUGE GM 17 | 2 |
| Summarization | ZeroSCROLLS SpaceDigest (test) | ES 77.9 | 2 |
| Question Answering | ZeroSCROLLS MuSiQue (test) | F1 Score 52.2 | 2 |