| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Long-context language modeling | Matched Quality Evaluation Suite 128K | Q Score (Dense): 80.12 | 3 |
| Long-context language modeling | Matched Quality Evaluation Suite (32K) | Q* Score (Dense): 79.39 | 3 |
| Long-context language modeling | Matched Quality Evaluation Suite 8K | Q* Score (Dense): 81.35 | 3 |