Sentence Evaluation Corpus

Benchmarks

Task Name                Dataset Name                                      SOTA Result                     Trend
Tokenization Efficiency  1.5-million sentence evaluation corpus Overall    Other Tokens: 68,739,586        3
Tokenization Efficiency  1.5-million sentence evaluation corpus English    Other Tokens Count: 7,904,670   3
Tokenization Efficiency  1.5-million sentence evaluation corpus Hindi      Other Tokens: 18,394,075        3
Tokenization Efficiency  1.5-million sentence evaluation corpus (Sinhala)  Other Token Count: 17,360,196   3