| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Question Answering | QuALITY | Exact Match | 83.8 | 10 |
| Multiple-Choice Question Answering | QuALITY Hard Subset | Accuracy | 62.9 | 6 |
| Multiple-Choice Question Answering | QuALITY (test) | Accuracy | 74.7 | 6 |
| Reading Comprehension | QuALITY | P@1 | 76.4 | 6 |
| Multiple-Choice Question Answering | QuALITY | Accuracy | 38 | 6 |
| Question Answering | QuALITY (test) | F1 Score | 39.25 | 6 |
| Question Answering | QuALITY multiple choice (test) | Accuracy | 86.09 | 4 |
| Reading Comprehension | QuALITY 0-shot | Accuracy | 40.9 | 4 |
| Question Answering | QuALITY ZeroSCROLLS leaderboard (test) | Accuracy | 72.8 | 4 |
| Question Answering | QuALITY hard | Accuracy | 76.2 | 4 |
| Reading Comprehension | QuALITY (test) | Accuracy | 87.6 | 3 |
| Question Answering | QuALITY 0-shot | Log Accuracy | 38.9 | 2 |
| Question Answering | QuALITY (dev) | Exact Match | 37.6 | 2 |