| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | MKQA | fEM 47.6 | 27 |
| Cross-lingual retrieval | MKQA | Avg. Recall@100 76.6 | 27 |
| Multilingual Retrieval-Augmented Generation | MKQA 1.0 (test) | Accuracy (AR) 0.6255 | 18 |
| Confidence Estimation | MKQA (test) | AUROC 0.82 | 14 |
| Cross-lingual Question Answering | MKQA Average across languages | fEM 45.92 | 14 |
| Cross-lingual Question Answering | MKQA Arabic | fEM 24.12 | 14 |
| Cross-lingual Question Answering | MKQA Thai | fEM 27.83 | 14 |
| Cross-lingual Question Answering | MKQA French | fEM 59.67 | 14 |
| Cross-lingual Question Answering | MKQA English | fEM 72.07 | 14 |
| Open-domain Question Answering | MKQA | Accuracy 15.94 | 12 |
| Question Retrieval | MKQA (full) | Retrieval Accuracy 29.9 | 12 |
| Multilingual Knowledge Question Answering | MKQA (test) | F1-score (All) 52.3 | 10 |
| Downstream Generation and Ranking Alignment | MKQA (test) | 3-gram Recall 49.9 | 8 |
| Retrieval | MKQA eng | nDCG@1 15.1 | 6 |
| Confidence Estimation | MKQA Japanese ja (test) | AUROC 83 | 5 |
| Confidence Estimation | MKQA Russian / ru (test) | AUROC 80 | 5 |
| Confidence Estimation | MKQA Spanish es (test) | AUROC 82 | 5 |
| Information Retrieval | MKQA eng | nDCG@10 35.2 | 5 |
| Open-domain question answering | MKQA English Language (test) | NOB Accuracy 45.42 | 5 |
| Open-domain question answering | MKQA Target Language (test) | NOB Accuracy 40.64 | 5 |
| Confidence Estimation | MKQA Japanese | AUROC 77 | 2 |
| Confidence Estimation | MKQA Russian | AUROC 0.78 | 2 |
| Confidence Estimation | MKQA Polish | AUROC 0.74 | 2 |
| Confidence Estimation | MKQA Spanish | AUROC 77 | 2 |
| Confidence Estimation | MKQA English | AUROC 76 | 2 |