
QASPER

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | qasper | F1 Score 36.9 | 61 |
| Single-document retrieval | Qasper | F1 Score 50.3 | 44 |
| Language Generation | QASPER | Accuracy 15.35 | 35 |
| Single-hop Question Answering | Qasper | Score 44.79 | 22 |
| Question Answering | QASPER 1200:251 (test) | Answerable EM 28.92 | 20 |
| Long-context Question Answering | Qasper | F1 83.09 | 17 |
| Question Answering | Qasper | F1 Score 0.3677 | 16 |
| Question Answering | QASPER (test) | F1 Score (Match) 55.7 | 16 |
| Question Answering | Qasper | Recall 67.3 | 15 |
| Multi-session Retrieval-Augmented Generation | QASPER (test) | F1 Score 36 | 12 |
| Speculative Decoding | Qasper | SR 1.66 | 12 |
| Document Question Answering | Qasper | Accuracy 0.552 | 11 |
| Single-document retrieval | Qasper | Latency (s) 0.0054 | 11 |
| Long document retrieval | Qasper (test) | F1 Score 46.18 | 11 |
| Completeness | QASPER | Kendall's Tau 0.44 | 11 |
| Long-context Question Answering | Qasper | Extract F1 54.57 | 10 |
| Faithfulness Evaluation | Qasper yes/no question answering | AOPC@10 0.102 | 10 |
| Question Answering | Qasper (val) | F1 28.8 | 10 |
| Question Answering | QASPER Multi-Document 4 | Accuracy 76.2 | 9 |
| Question Answering | QASPER Extractive (test) | F1 53.3 | 8 |
| Question Answering | QASPER Extractive (dev) | F1 29.6 | 8 |
| RAG-Completeness | QASPER (test) | Mean Kendall Tau Correlation 0.44 | 6 |
| Document Retrieval | Qasper | R@1 0.25 | 2 |
| Question Answering | QASPER SCROLLS (val) | F1 Score 0.275 | 2 |
| Question Answering | Qasper (dev) | F1 40.6 | 2 |