| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|---|
| Multi-answer Question Answering | MAQA-ΔK−1 | KL Divergence: -0.149 | 48 |
| Uncertainty Quantification | MAQA ∆K−1 | KL Divergence AUC: 0.757 | 28 |
| Uncertainty Quantification | MAQA | Hamming AUC: 83.5 | 28 |
| Multi-answer Question Answering | MAQA | Hamming Distance: 0.04 | 28 |
| Multi-answer Question Answering (Sets) | MAQA {0, 1}^K | Hamming Score: 102.3 | 20 |
| Question Answering | MAQA | Accuracy: 0.635 | 7 |
| Error Detection | MAQA* High-Ambiguity Subset H[p*] >= 1.5 (test) | AUROC: 65 | 5 |
| Classification | MAQA (test) | Accuracy: 63.5 | 5 |