
L-Eval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Long-context language understanding | L-Eval | Coursera 58.28 | 26 |
| Long-context language understanding | L-Eval (test) | Coursera 58.28 | 26 |
| Long-context Summarization | L-Eval Sum | QMS 22.66 | 13 |
| Long-context Question Answering | L-Eval QA | NQ 80.73 | 13 |
| Long-context evaluation | L-Eval | Close Score 68.8 | 13 |
| Closed-ended Task Evaluation | L-Eval closed-ended tasks | Coursera Score 41.86 | 12 |