NLP Benchmark Suite

Benchmarks

Task Name                 Dataset Name                   SOTA Result           Trend
Language Modeling         NLP Benchmark Suite Aggregate  Average Delta: -9.2   16
Aggregate NLP Evaluation  NLP Benchmark Suite Average    Average Accuracy: 64  9