SlovakBERT: Slovak Masked Language Model
About
We introduce SlovakBERT, a new Slovak masked language model. To the best of our knowledge, this is the first paper discussing Slovak transformer-based language models. We evaluate the model on several NLP tasks and achieve state-of-the-art results. This evaluation likewise constitutes the first attempt to establish a benchmark for Slovak language models. We publish the masked language model, as well as models fine-tuned for part-of-speech tagging, sentiment analysis and semantic textual similarity.
Matúš Pikuliak, Štefan Grivalský, Martin Konôpka, Miroslav Blšták, Martin Tamajka, Viktor Bachratý, Marián Šimko, Pavol Balážik, Michal Trnka, Filip Uhlárik • 2021
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic Textual Similarity | SICK Slovak (val) | Pearson Correlation | 0.7515 | 33 |
| Semantic Textual Similarity | STS Benchmark Slovak (val) | Pearson Correlation | 0.7537 | 33 |
| Sentiment Analysis | Slovak Sentiment Analysis | F1 Score | 70.98 | 2 |
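The semantic textual similarity results are reported as the Pearson correlation between the model's predicted similarity scores and human-annotated gold scores. A minimal sketch of how that metric is computed (the score vectors below are illustrative placeholders, not values from the paper or its datasets):

```python
import numpy as np

# Hypothetical predicted similarity scores and gold annotations for an
# STS-style evaluation; the numbers are made up for illustration only.
predicted = np.array([4.2, 1.1, 3.5, 0.8, 2.9])
gold = np.array([4.0, 1.5, 3.8, 1.0, 2.5])

def pearson(x, y):
    """Pearson correlation: covariance of the two centered score vectors
    divided by the product of their standard deviations."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float((x_c @ y_c) / np.sqrt((x_c @ x_c) * (y_c @ y_c)))

r = pearson(predicted, gold)  # value in [-1, 1]; 1 means perfect agreement
```

A benchmark score such as 0.7537 would mean the model's similarity rankings agree strongly, but not perfectly, with human judgments on that validation set.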