SOTA End-to-end Question Answering benchmarks and papers with code | Wizwand
End-to-end Question Answering
Benchmarks
| Dataset Name | SOTA Method | Metric | SOTA Value | Results | Last Updated |
|---|---|---|---|---|---|
| MultiHopRAG (test val) | SetR-CoT & IRI | Accuracy | 47.14 | 20 | 1mo ago |
| MuSiQue (test val) | SetR-CoT & IRI | EM | 10.79 | 20 | 1mo ago |
| 2WikiMultiHopQA (test val) | SetR-CoT & IRI | EM | 35.44 | 20 | 1mo ago |
| HotpotQA (test val) | SetR-Selection only | EM | 36.68 | 20 | 1mo ago |
| MultiHop-RAG overall | DPP Base, w/o P-Adapter | Exact Match | 50.61 | 12 | 12d ago |
| MultiHop-RAG 4-hop | DPP Base, w/o P-Adapter | Exact Match | 74.21 | 12 | 12d ago |
| MultiHop-RAG 3-hop | ScalDPP | Exact Match | 48.38 | 12 | 12d ago |
| MultiHop-RAG 2-hop | DPP Base, w/o P-Adapter | Exact Match | 45.09 | 12 | 12d ago |
| HotpotQA official Wikipedia paragraphs | HopRetriever | EM | 67.1 | 9 | 1mo ago |
| MMCoQA (test) | LILaC (w/ MM-Embed) | EM | 36.31 | 7 | 1mo ago |
| MultimodalQA (test) | LILaC (w/ MM-Embed) | EM | 44.57 | 7 | 1mo ago |
| InfoVQA (test) | LILaC (w/ MM-Embed) | EM | 60.91 | 7 | 1mo ago |
| MP-DocVQA (test) | LILaC (w/ MM-Embed) | EM | 65.48 | 7 | 1mo ago |
| AGORA (test) | Base ColBERT / DPO Mistral | Relevancy | 74.7 | 6 | 23d ago |
| OTT-QA | ARM | Accuracy | 0.317 | 6 | 1mo ago |
| Bird | ARM | Accuracy | 20.6 | 6 | 1mo ago |
| MCP-Bench | TURA | Accuracy (Human) | 87.5 | 4 | 1mo ago |
| InfoSeek 25-sample perturbed subset | Oracle | Rotation | 52 | 4 | 1mo ago |
| HotpotQA ANCE | ACQO | MAP@10 | 49.6 | 3 | 1mo ago |
| 3+ hop challenge questions (official Wikipedia paragraphs) | IRRR | EM | 32.5 | 3 | 1mo ago |
Showing 20 of 20 rows