| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Hallucination detection | NQ-Open | AUROC 0.8843 | 61 |
| Question Answering | NQ-Open (val) | Accuracy 30.7 | 28 |
| Question Answering | NQ-Open In-Domain (test) | Precision 58.13 | 26 |
| Factual Question Answering | NQ-Open ID | Precision 57.34 | 24 |
| Question Answering | NQ-open v1.0 (test) | A1 79.08 | 16 |
| Hallucination Detection | NQ Open (test) | AUROC 89.4 | 14 |
| Question Answering | NQ-Open (out-of-domain) | Precision 0.705 | 12 |