SOTA End-to-end Question Answering benchmarks and papers with code | Wizwand
End-to-end Question Answering
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| FinSlides | SlideAgent | Overall Score | 85.5 | 25 | 1mo ago |
| TechSlides | SlideAgent | Overall Score | 70.9 | 25 | 1mo ago |
| MultiHopRAG (test val) | SetR-CoT & IRI | Accuracy | 47.14 | 20 | 3mo ago |
| MuSiQue (test val) | SetR-CoT & IRI | EM | 10.79 | 20 | 3mo ago |
| 2WikiMultiHopQA (test val) | SetR-CoT & IRI | EM | 35.44 | 20 | 3mo ago |
| HotpotQA (test val) | SetR-Selection only | EM | 36.68 | 20 | 3mo ago |
| MultiHop-RAG overall | DPP Base, w/o P-Adapter | Exact Match | 50.61 | 12 | 1mo ago |
| MultiHop-RAG 4-hop | DPP Base, w/o P-Adapter | Exact Match | 74.21 | 12 | 1mo ago |
| MultiHop-RAG 3-hop | ScalDPP | Exact Match | 48.38 | 12 | 1mo ago |
| MultiHop-RAG 2-hop | DPP Base, w/o P-Adapter | Exact Match | 45.09 | 12 | 1mo ago |
| MuSiQue (test) | INTRA | EM | 14 | 9 | 26d ago |
| HotPotQA (test) | INTRA | EM | 0.464 | 9 | 26d ago |
| DoRA IBM Granite retriever (test) | DoRA SFT | Token F1 Score | 57.52 | 9 | 1mo ago |
| DoRA full (test) | DoRA SFT | Token F1 Score | 56.53 | 9 | 1mo ago |
| HotpotQA official Wikipedia paragraphs | HopRetriever | EM | 67.1 | 9 | 3mo ago |
| MMCoQA (test) | LILaC (w/ MM-Embed) | EM | 36.31 | 7 | 3mo ago |
| MultimodalQA (test) | LILaC (w/ MM-Embed) | EM | 44.57 | 7 | 3mo ago |
| InfoVQA (test) | LILaC (w/ MM-Embed) | EM | 60.91 | 7 | 3mo ago |
| MP-DocVQA (test) | LILaC (w/ MM-Embed) | EM | 65.48 | 7 | 3mo ago |
| AGORA (test) | Base ColBERT / DPO Mistral | Relevancy | 74.7 | 6 | 2mo ago |
| OTT-QA | ARM | Accuracy | 0.317 | 6 | 3mo ago |
| Bird | ARM | Accuracy | 20.6 | 6 | 3mo ago |
| MCP-Bench | TURA | Accuracy (Human) | 87.5 | 4 | 2mo ago |
| InfoSeek 25-sample perturbed subset | Oracle | Rotation | 52 | 4 | 3mo ago |
| HotpotQA ANCE | ACQO | MAP@10 | 49.6 | 3 | 3mo ago |
Showing 25 of 26 rows