SOTA General Question Answering benchmarks and papers with code | Wizwand
General Question Answering
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| TriviaQA | Agentic-R | Exact Match | 69.02 | 54 | 1mo ago |
| NQ | SkillOrchestra+ | Exact Match (EM) | 54.8 | 52 | 1mo ago |
| PopQA | InForage-instruct | EM | 45.2 | 51 | 1mo ago |
| General QA NQ, TriviaQA, PopQA | Qwen2.5-32B-Instruct + Proposed Method | NQ Accuracy | 58.2 | 40 | 25d ago |
| NQ (Natural Questions) | Search-R1 | EM | 46.1 | 32 | 1mo ago |
| TriviaQA (test val) | DeepControl | EM | 68.2 | 24 | 3mo ago |
| Natural Questions (NQ) (test val) | DeepControl | EM | 55.8 | 24 | 3mo ago |
| TriviaQA | Oracle | Token-Level F1 | 76.4 | 20 | 1mo ago |
| PopQA | SkillOrchestra+ | Accuracy | 48.8 | 18 | 3mo ago |
| TriviaQA | SkillOrchestra+ | Accuracy | 80.2 | 18 | 3mo ago |
| TriviaQA | Search-R1 | EM | 64.4 | 18 | 3mo ago |
| TriviaQA | π-Play | Score | 64.6 | 16 | 1mo ago |
| PopQA out-of-domain (val test) | Search-R2 | Exact Match (EM) | 50.1 | 15 | 3mo ago |
| TriviaQA out-of-domain (val test) | Search-R2 | EM | 70.9 | 15 | 3mo ago |
| NQ (Natural Questions) in-domain (val/test) | Search-R2 | Exact Match | 50.9 | 15 | 3mo ago |
| PopQA Pop. | StepSearch-base | OSR Score | 27.7 | 13 | 1mo ago |
| TriviaQA | StepSearch-base | OSR (%) | 12.2 | 13 | 1mo ago |
| NQ (Natural Questions) | StepSearch-base | OSR (%) | 25.2 | 13 | 1mo ago |
| OOD Non-Math GPQA, CommonsenseQA | Baseline LLM | Pass@1 | 71.4 | 12 | 2mo ago |
| TriviaQA (test) | Search-R1 + EKA | F1 | 66.1 | 11 | 3mo ago |
| PopQA | Dynamic Collaboration Framework | EM | 47.87 | 10 | 1mo ago |
| TriviaQA | CoT | EM | 66.3 | 10 | 1mo ago |
| NQ | Dynamic Collaboration Framework | EM | 46.08 | 10 | 1mo ago |
| PopQA (test) | Workflow-R1-Search | EM | 49.3 | 10 | 3mo ago |
| TriviaQA (test) | Workflow-R1-Search | EM | 73.3 | 10 | 3mo ago |
Showing 25 of 31 rows