# Typologically Informed Parameter Aggregation

## About
Massively multilingual language models enable cross-lingual generalization but underperform on low-resource and unseen languages. While adapter-based fine-tuning offers a parameter-efficient solution, training language-specific adapters at scale remains costly. We introduce Typologically Informed Parameter Aggregation (TIPA), a training-free method that constructs proxy language adapters by aggregating existing ones, weighted by typological similarity. Integrated into the MAD-X framework, these proxies enable zero-shot cross-lingual transfer without additional training. We evaluate TIPA on five NLP tasks and over 230 languages. TIPA consistently outperforms or matches baselines such as English-only fine-tuning and selecting the typologically closest language adapter, with the largest gains for languages lacking dedicated adapters. Our results demonstrate that typologically informed aggregation is a viable, training-free alternative to language-specific modules.
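The core operation described above is a similarity-weighted average of existing adapter parameters. The sketch below illustrates the idea under stated assumptions: the function name `tipa_proxy_adapter`, the dictionary-based adapter representation, and the similarity scores are all hypothetical and not taken from the paper's actual implementation.

```python
import numpy as np

def tipa_proxy_adapter(adapters, similarities):
    """Build a proxy adapter for a target language as a
    typological-similarity-weighted average of existing adapters.

    adapters: dict mapping language code -> dict of parameter
              name -> np.ndarray (the adapter's weights)
    similarities: dict mapping language code -> nonnegative
                  typological similarity to the target language
    """
    # Keep only languages for which an adapter actually exists.
    weights = {lang: sim for lang, sim in similarities.items()
               if lang in adapters and sim > 0}
    total = sum(weights.values())
    # Normalize so the aggregation weights sum to 1.
    norm = {lang: w / total for lang, w in weights.items()}

    # All adapters are assumed to share the same parameter names/shapes.
    param_names = next(iter(adapters.values())).keys()
    proxy = {}
    for name in param_names:
        proxy[name] = sum(norm[lang] * adapters[lang][name]
                          for lang in norm)
    return proxy

# Toy usage: two source adapters, target closer to "deu" than "nld".
adapters = {
    "deu": {"down_proj": np.array([1.0, 0.0])},
    "nld": {"down_proj": np.array([0.0, 1.0])},
}
similarities = {"deu": 3.0, "nld": 1.0}
proxy = tipa_proxy_adapter(adapters, similarities)
# proxy["down_proj"] is the weighted average [0.75, 0.25]
```

In practice the proxy would be plugged into the MAD-X language-adapter slot in place of a trained adapter; how similarities are computed (e.g. from typological feature vectors) is left abstract here.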
## Related benchmarks

| Task | Dataset | Score | Rank |
|---|---|---|---|
| Multilingual Natural Language Processing | Aggregate (234 languages) | 54.1 | 5 |
| Named Entity Recognition | NER (136 languages) | 51.3 | 5 |
| Question Answering | QA (12 languages) | 72.9 | 5 |
| Topic Classification | SIB (176 languages) | 63.4 | 5 |
| Part-of-Speech Tagging | POS (80 languages) | 46.8 | 5 |
| Choice of Plausible Alternatives | COPA (11 languages) | 51.8 | 5 |