| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Hallucination Detection | SelfAware Gemini outputs (test) | AUROC | 52.8 | 15 |
| Hallucination Detection | SelfAware GPT outputs (test) | AUROC | 0.528 | 15 |
| Hallucination Detection | SelfAware Llama outputs (test) | AUROC | 58.7 | 15 |
| Factuality | SelfAware | Score | 0.372 | 10 |
| Self-awareness | SelfAware | Accuracy | 51.2 | 10 |
| Hallucination Detection | SelfAware | AUROC | 0.587 | 9 |
| Question Answering with Abstention | SELFAWARE | U-Ref | 91.4 | 7 |
| Question Answering | SelfAware (out-of-domain) | nAUPC | 9.9 | 4 |
| Question Answering | SelfAware | Accuracy | 27 | 1 |