Single-Hop Question Answering on TriviaQA 2018 Wikipedia dump (dev)
Top result: MR-Search, 66.6 Accuracy (Mar 11, 2026)
[Chart: Accuracy over time]
Evaluation Results
Method            Backbone      Date      Accuracy
MR-Search         Qwen2.5-7b    2026.03   66.6
StepResearch      Qwen2.5-7b    2026.03   63.6
MR-Search         Qwen2.5-3b    2026.03   63.5
Search-R1         Qwen2.5-7b    2026.03   63.2
Search-R1         Qwen2.5-3b    2026.03   62.2
StepResearch      Qwen2.5-3b    2026.03   61.5
PPRM              Qwen2.5-7b    2026.03   61.0
ReSearch          Qwen2.5-7b    2026.03   60.5
ReSearch          Qwen2.5-3b    2026.03   59.7
PPRM              Qwen2.5-3b    2026.03   56.5
Search-o1         Qwen2.5-3b    2026.03   47.2
Search-o1         Qwen2.5-7b    2026.03   44.3
Direct Inference  Qwen2.5-7b    2026.03   40.8
Direct Inference  Qwen2.5-3b    2026.03   28.8