mT5: A massively multilingual pre-trained text-to-text transformer

About

The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. We also describe a simple technique to prevent "accidental translation" in the zero-shot setting, where a generative model chooses to (partially) translate its prediction into the wrong language. All of the code and model checkpoints used in this work are publicly available.
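As an illustration of the text-to-text setup, here is a minimal sketch of loading a released mT5 checkpoint and decoding a prediction. It assumes the Hugging Face `transformers` library and the `google/mt5-small` checkpoint, neither of which is named in the abstract; the prompt wording is illustrative only.

```python
# Minimal sketch: load an mT5 checkpoint and run text-to-text generation.
# Assumes the Hugging Face `transformers` library and the `google/mt5-small`
# checkpoint (not named in the abstract above).
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

# mT5 casts every task as text-to-text: the task is encoded in the input
# string and the answer is decoded as a string. The prompt below is a toy
# summarization-style example.
inputs = tokenizer(
    "summarize: mT5 is a multilingual variant of T5 pre-trained on "
    "Common Crawl data covering 101 languages.",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that the released checkpoints are pre-trained only, so the model must be fine-tuned on a downstream task before the decoded output is meaningful.

The abstract does not spell out the accidental-translation fix; in the paper it amounts to mixing a small amount of the unsupervised multilingual pre-training task into fine-tuning, so the decoder keeps generating non-English text. A generic sketch of that kind of data mixing follows; the mixing rate and stream names are assumptions for illustration, not values from the paper.

```python
# Sketch of the mitigation for "accidental translation": interleave a small
# fraction of unsupervised multilingual pre-training examples into the
# fine-tuning stream. `pretrain_rate` and the helper names are illustrative.
import random

def mixed_examples(finetune_stream, pretrain_stream, pretrain_rate=0.01):
    """Yield fine-tuning examples, occasionally inserting a pre-training one.

    `pretrain_stream` must be an iterator over unsupervised (span-corruption)
    examples covering all pre-training languages.
    """
    for finetune_example in finetune_stream:
        if random.random() < pretrain_rate:
            yield next(pretrain_stream)
        yield finetune_example
```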

Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Natural Language Understanding | GLUE (val) | - | 170 |
| Natural Language Inference | XNLI (test) | Average Accuracy: 85 | 167 |
| Natural Language Inference | XNLI | Accuracy: 87.1 | 111 |
| Multimodal Summarization | MM-Sum Zero-Resource Languages (test) | ROUGE-1: 33.67 | 96 |
| Multimodal Summarization | MM-Sum low-resource 1.0 | ROUGE-1: 47.36 | 96 |
| Multimodal Abstractive Summarization | MM-Sum mid-high-resource (test) | ROUGE-1: 40.3 | 90 |
| Multimodal Abstractive Summarization | MM-Sum mid-high-resource | ROUGE-1: 40.31 | 90 |
| Natural Language Understanding | SuperGLUE (test) | BoolQ Accuracy: 78.1 | 63 |
| Named Entity Recognition | WikiAnn (test) | - | 58 |
| Paraphrase Identification | PAWS-X | Accuracy: 91.5 | 57 |

Showing 10 of 194 rows.
