| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Tokenization Efficiency | 1.5-million sentence evaluation corpus (Overall) | Other Tokens: 68,739,586 | 3 |
| Tokenization Efficiency | 1.5-million sentence evaluation corpus (English) | Other Tokens Count: 7,904,670 | 3 |
| Tokenization Efficiency | 1.5-million sentence evaluation corpus (Hindi) | Other Tokens: 18,394,075 | 3 |
| Tokenization Efficiency | 1.5-million sentence evaluation corpus (Sinhala) | Other Token Count: 17,360,196 | 3 |