BERTje: A Dutch BERT Model
About
The transformer-based pre-trained language model BERT has helped to improve state-of-the-art performance on many natural language processing (NLP) tasks. Using the same architecture and parameters, we developed and evaluated a monolingual Dutch BERT model called BERTje. Compared to the multilingual BERT model, which includes Dutch but is only based on Wikipedia text, BERTje is based on a large and diverse dataset of 2.4 billion tokens. BERTje consistently outperforms the equally-sized multilingual BERT model on downstream NLP tasks (part-of-speech tagging, named-entity recognition, semantic role labeling, and sentiment analysis). Our pre-trained Dutch BERT model is made available at https://github.com/wietsedv/bertje.
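Since BERTje shares BERT's architecture, it can be loaded with standard BERT tooling. A minimal sketch using the Hugging Face Transformers library, assuming the published `GroNLP/bert-base-dutch-cased` checkpoint (see the GitHub repository for the authoritative loading instructions):

```python
# Sketch: extracting contextual embeddings from BERTje via Hugging Face
# Transformers. The model id below is assumed from the public release;
# consult https://github.com/wietsedv/bertje for current usage.
from transformers import AutoTokenizer, AutoModel

MODEL_ID = "GroNLP/bert-base-dutch-cased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

# Encode a Dutch sentence and run a forward pass.
inputs = tokenizer("Het is een mooie dag.", return_tensors="pt")
outputs = model(**inputs)

# One 768-dimensional vector per (sub)token, as in BERT-base.
print(outputs.last_hidden_state.shape)
```

These token-level representations are what the downstream taggers and classifiers in the evaluation are fine-tuned on top of.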
Wietse de Vries, Andreas van Cranenburgh, Arianna Bisazza, Tommaso Caselli, Gertjan van Noord, Malvina Nissim · 2019
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Named Entity Recognition | CoNLL-2002 (test) | F1 | 88.3 | 7 |
| Die/Dat Disambiguation | Europarl | Accuracy | 98.268 | 5 |
| Part-of-Speech Tagging | Lassy UD (test) | Accuracy | 96.3 | 5 |
| Sentiment Analysis | 110k Dutch Book Reviews Dataset (test) | Accuracy | 93 | 4 |
| Die/Dat Disambiguation | Europarl 10k | Accuracy | 93.096 | 4 |
| Sentiment Analysis | DBRD (full dataset) | Accuracy | 93 | 4 |
| Part-of-Speech Tagging | UD-LassySmall 2.5 (train) | Accuracy | 99.6 | 3 |
| Part-of-Speech Tagging | UD-LassySmall 2.5 (dev) | Accuracy | 96.8 | 3 |
| Part-of-Speech Tagging | UD-LassySmall 2.5 (test) | Accuracy | 96.6 | 3 |
| Part-of-Speech Tagging | SoNaR-1 coarse (train) | Accuracy | 99.8 | 3 |
*(Showing 10 of 25 rows.)*