| Dataset Name | SOTA Method | Metric | Score | Trend | Last Updated |
|---|---|---|---|---|---|
| TydiQA | ByT5 | Accuracy | 81.9 | 65 | 1mo ago |
| TyDiQA 1-shot macro-averaged | Filter-then-Weight | F1 Score (1-shot macro) | 48.86 | 28 | 16d ago |
| TyDiQA GoldP (val) | ByT5 XXL | Ar Score | 80 | 20 | 11d ago |
| M-ARC | OrthoMerge | Accuracy | 44.75 | 10 | 1mo ago |
| TydiQA | Forgetting | F1 Score | 48.71 | 4 | 18d ago |
| MultiLoKo 31 languages (test) | Qwen3-30B-A3B | Overall Score | 26 | 4 | 1mo ago |
| ECleKTic 12 languages (test) | GPT-OSS-120B | Overall Score | 21.6 | 4 | 1mo ago |
| xquad vi | ACE | Normalized Performance | 89.29 | 3 | 1mo ago |
| xquad zh | ACE | Normalized Performance | 60.28 | 3 | 1mo ago |
| Speech-XBelebele Text -> Text | Spectrum | Accuracy | 63.64 | 1 | 1mo ago |