Single-Hop Search-augmented Question Answering on PopQA
[Chart: Success Rate; top result 41.8 (+ DS-Adapter), May 29, 2026]
Evaluation Results
| Method | Backbone | Date | Success Rate |
|---|---|---|---|
| + DS-Adapter | Qwen3-4B | 2026.05 | 41.8 |
| + MASA | Qwen3-14B | 2026.05 | 40.7 |
| No Skill | Qwen3-14B | 2026.05 | 40.6 |
| + DS-Adapter | Qwen3-32B | 2026.05 | 40.6 |
| + MASA | Qwen3-32B | 2026.05 | 40 |
| + Base Skill | Qwen3-14B | 2026.05 | 39.5 |
| + DS-Adapter | Qwen3-14B | 2026.05 | 39.5 |
| + Base Skill | Qwen3-32B | 2026.05 | 39.3 |
| + MASA | Qwen3-8B | 2026.05 | 39 |
| + MASA | Qwen3-4B | 2026.05 | 38.9 |
| + Base Skill | Qwen3-8B | 2026.05 | 38.8 |
| + DS-Adapter | Qwen3-8B | 2026.05 | 38.7 |
| No Skill | Qwen3-32B | 2026.05 | 38.3 |
| + Base Skill | Qwen3-4B | 2026.05 | 38.2 |
| No Skill | Qwen3-4B | 2026.05 | 37.2 |
| No Skill | Qwen3-8B | 2026.05 | 30.3 |