| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Multi-hop Question Answering | MoreHopQA | Accuracy | 86.4 | 25 |
| Uncertainty Estimation | MoreHopQA Camel | AUROC | 65.29 | 16 |
| Multi-hop Question Answering | MorehopQA | AUROC | 0.6457 | 16 |
| Uncertainty Estimation | MoreHopQA AutoGen (test) | AUROC | 63.92 | 16 |
| Open-ended Question Answering | MoreHopQA (test) | Accuracy | 77 | 11 |
| Multi-hop QA Retrieval | MoreHopQA | NDCG | 0.908 | 5 |
| Question Answering | MoreHopQA | Inference Time (s) | 7.27 | 4 |