QuALITY

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | QuALITY (test) | Accuracy 77.71 | 90 |
| Multiple-Choice Question Answering | QuALITY | Accuracy 56.997 | 19 |
| Question Answering | QUALITY | Exact Match 83.8 | 10 |
| Multiple-Choice Question Answering | QuALITY Hard Subset | Accuracy 62.9 | 6 |
| Multiple-Choice Question Answering | QuALITY (test) | Accuracy 74.7 | 6 |
| Reading Comprehension | QuaLITY | P@1 76.4 | 6 |
| Long-context Question Answering | QuALITY | Accuracy 73.63 | 5 |
| Question Answering | QuALITY multiple choice (test) | Accuracy 86.09 | 4 |
| Reading Comprehension | QuALITY 0-shot | Accuracy 40.9 | 4 |
| Question Answering | QuALITY ZeroSCROLLS leaderboard (test) | Accuracy 72.8 | 4 |
| Question Answering | QuALITY hard | Accuracy 76.2 | 4 |
| Reading Comprehension | QuALITY (test) | Accuracy 87.6 | 3 |
| Question Answering | QuALITY 0-shot | Log Accuracy 38.9 | 2 |
| Question Answering | QuALITY (dev) | Exact Match 37.6 | 2 |