Long-Context Encoder Models for Polish Language Understanding

About

While decoder-only Large Language Models (LLMs) have recently dominated the NLP landscape, encoder-only architectures remain a cost-effective and parameter-efficient standard for discriminative tasks. However, classic encoders like BERT are limited by a short context window, which is insufficient for processing long documents. In this paper, we address this limitation for Polish by introducing a high-quality Polish model capable of processing sequences of up to 8192 tokens. The model was developed with a two-stage training procedure that involves positional embedding adaptation followed by full-parameter continuous pre-training. Furthermore, we propose compressed model variants trained via knowledge distillation. The models were evaluated on 25 tasks, including the KLEJ benchmark, a newly introduced financial task suite (FinBench), and other classification and regression tasks, specifically those requiring long-document understanding. The results demonstrate that our model achieves the best average performance among Polish and multilingual models, significantly outperforming competitive solutions on long-context tasks while maintaining comparable quality on short texts.

Sławomir Dadas, Rafał Poświata, Marek Kozłowski, Małgorzata Grębowiec, Michał Perełkiewicz, Paweł Klimiuk, Przemysław Boruta • 2026

Related benchmarks

Task | Dataset | Result | Rank
Financial Language Understanding | FinBench 7 tasks (val) | FinBench Score: 85.19 | 13
General Language Understanding | All tasks (25 tasks) (val) | Overall Accuracy: 85.93 | 13
Language Understanding | Other tasks (9 tasks) (val) | Other Tasks Score: 83.92 | 13
Language Understanding | KLEJ 9 tasks (val) | KLEJ Score: 88.52 | 13
Long-context Language Understanding | Long tasks 4 tasks (val) | Long Tasks Score: 83.16 | 13
Binary Classification | IMDB | Accuracy: 96.03 | 9
Financial NLP | FinBench | Banking-Short Accuracy: 81.99 | 3
General Polish Language Understanding | Average 25 Tasks | Average Score: 85.93 | 3
Multi-Label Classification | MIPD | Weighted F1: 68.5 | 3
Multi-Label Classification | EURLEX | Weighted F1: 79.77 | 3
Showing 10 of 17 rows
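
The group scores listed above appear to be consistent with the overall figure. The short Python sketch below checks this under two assumptions that the listing does not state: that the 25-task average is a task-weighted mean of the KLEJ (9 tasks), FinBench (7 tasks) and Other (9 tasks) groups, and that the 4 long-context tasks are drawn from those groups rather than counted separately (9 + 7 + 9 = 25). Because the group scores are rounded to two decimals, this is a plausibility check, not a reproduction of the evaluation.

```python
# Task counts and group scores copied from the benchmark rows above.
# Assumption: the overall score is a task-weighted mean of these three groups.
groups = {
    "KLEJ": (9, 88.52),
    "FinBench": (7, 85.19),
    "Other": (9, 83.92),
}


def weighted_mean(groups):
    """Return the mean of group scores weighted by each group's task count."""
    total_tasks = sum(n for n, _ in groups.values())
    weighted_sum = sum(n * score for n, score in groups.values())
    return weighted_sum / total_tasks


overall = weighted_mean(groups)
print(f"{overall:.2f}")  # prints 85.93
```

The result matches the reported 85.93, which suggests that the Long tasks score (83.16) is a separate view over a subset of the same 25 tasks.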
