Tokenization on Wikipedia Cyrillic-rich subsets (test)
[Chart: Russian (ru) tokenization metric. Data point shown: T-pro, 2.38, Dec 11, 2025. Selectable metrics: Russian (ru), Ukrainian (uk), Belarusian (be), Bulgarian (bg), Serbian (sr), Macedonian (mk), Kazakh (kk), Kyrgyz (ky), Average Tokenization Metric.]
Updated 4d ago
Evaluation Results

| Method | Details | Date | Russian (ru) | Ukrainian (uk) | Belarusian (be) | Bulgarian (bg) | Serbian (sr) | Macedonian (mk) | Kazakh (kk) | Kyrgyz (ky) | Average Tokenization Metric |
|---|---|---|---|---|---|---|---|---|---|---|---|
| T-pro | Tokenizer Type=Cyrilli... | 2025.12 | 2.38 | 2.8 | 2.94 | 2.35 | 2.62 | 2.41 | 3.07 | 3.09 | 2.71 |
| Ruadapt-Qwen3 | | 2025.12 | 2.43 | 3.29 | 3.54 | 2.5 | 3.07 | 2.7 | 4.6 | 3.97 | 3.26 |
| GigaChat | Model Version=GigaChat... | 2025.12 | 2.49 | 3.09 | 3.32 | 2.58 | 2.97 | 2.67 | 2.67 | 3.33 | 2.89 |
| gpt-oss | | 2025.12 | 2.7 | 2.92 | 3.03 | 2.56 | 2.73 | 2.59 | 3.11 | 3.17 | 2.85 |
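
The page does not define how the Average Tokenization Metric is derived. The reported values are consistent with an unweighted mean of the eight per-language scores, rounded to two decimals. The sketch below checks that assumption against the table. The names `SCORES`, `REPORTED_AVG`, and `average_metric` are illustrative only and are not part of the benchmark.

```python
from statistics import mean

# Column order of the per-language scores below.
LANGS = ["ru", "uk", "be", "bg", "sr", "mk", "kk", "ky"]

# Per-language scores copied from the results table.
SCORES = {
    "T-pro":         [2.38, 2.8, 2.94, 2.35, 2.62, 2.41, 3.07, 3.09],
    "Ruadapt-Qwen3": [2.43, 3.29, 3.54, 2.5, 3.07, 2.7, 4.6, 3.97],
    "GigaChat":      [2.49, 3.09, 3.32, 2.58, 2.97, 2.67, 2.67, 3.33],
    "gpt-oss":       [2.7, 2.92, 3.03, 2.56, 2.73, 2.59, 3.11, 3.17],
}

# Averages as reported in the last column of the table.
REPORTED_AVG = {
    "T-pro": 2.71,
    "Ruadapt-Qwen3": 3.26,
    "GigaChat": 2.89,
    "gpt-oss": 2.85,
}


def average_metric(values):
    """Unweighted mean of the per-language scores, rounded to two decimals.

    Assumption: the leaderboard averages all languages with equal weight.
    """
    return round(mean(values), 2)


for name, values in SCORES.items():
    assert len(values) == len(LANGS)
    computed = average_metric(values)
    print(f"{name}: computed {computed:.2f}, reported {REPORTED_AVG[name]:.2f}")
```

For all four methods the computed mean agrees with the reported average to two decimals, which supports the equal-weight reading but does not confirm it.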