| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Question Answering | AmbigQA | Cover EM | 60 | 18 |
| Question Answering | AmbigQA | EM | 61.3 | 11 |
| Disambiguation and completeness | AmbigQA | Personalization Bias | 0.113 | 9 |
| Question Answering with Clarification | AmbigQA Unambiguous queries (dev) | Reward | 42.05 | 8 |
| Question Answering with Clarification | AmbigQA Ambiguous queries (dev) | Reward | 15.81 | 8 |
| Question Answering | AmbigQA | Accuracy | 59.8 | 7 |
| Open-Domain QA | AmbigQA Nq=300 | Acc | 0.473 | 6 |
| Question Answering | AmbigQA | Helpfulness | 4.96 | 5 |
| Question Answering | AmbigQA (sampled) | Accuracy | 65.5 | 4 |
| Multi-answer Question Answering | AMBIGQA (test) | F1 (All Questions) | 46.2 | 3 |
| Multi-answer Question Answering | AMBIGQA (dev) | F1 (all questions) | 52.1 | 3 |