
ZeroSCROLLS

Benchmarks

Task Name                        Dataset Name                     SOTA Result            Trend
Long-context language modeling   ZeroSCROLLS (test)               GovReport Score 35.8   24
Long-context understanding       ZeroSCROLLS (val)                QuALITY EM 95.2        6
Question Answering               ZeroSCROLLS SQuALITY (test)      ROUGE GM 17            2
Summarization                    ZeroSCROLLS SpaceDigest (test)   ES 77.9                2
Question Answering               ZeroSCROLLS MuSiQue (test)       F1 Score 52.2          2