| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Question Answering | WQ (test) | AUROC | 76.6 | 90 |
| Question Answering | WQ | Absolute Execution Time Overhead (s) | 0.039 | 90 |
| Question Answering | WQ | PRR | 62.8 | 90 |
| Open-Domain Question Answering | WQ (test) | EM | 33.71 | 37 |
| Reward Modeling | WQ Arena | Accuracy | 65.29 | 22 |
| Inference Efficiency | WQ | Relative Execution Time Overhead | 0.014 | 12 |
| Open-domain retrieval | WQ | Recall@20 | 73.2 | 9 |
| Question Answering | WQ | Accuracy | 45.5 | 8 |