
QA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Question Answering | QA Zero-shot Average | QA Zero-shot Average: 73.45 | 57 |
| Question Answering | QA | Speedup Factor: 3.66 | 47 |
| Legal Text Classification | QA | Accuracy: 85.72 | 18 |
| Question Answering | QA | Accuracy: 59.5 | 12 |
| Steering | QA | Steering Success: 62.5 | 11 |
| Question Answering | QA benchmarks | ReCoRD Score: 80.86 | 9 |
| Question Answering | QA domain average | Best Accuracy: 85.2 | 8 |
| Critique Quality Evaluation | QA | Win Rate: 75 | 6 |
| Question Answering | QA 12 languages | Score: 72.9 | 5 |
| Speculative Decoding | Qa | Speedup: 2.23 | 3 |