Deep Search QA on MuSiQue
[Chart: Accuracy over time. Top entry: ProCeedRL, 29.52 Accuracy, Apr 2, 2026.]
Updated 16d ago
Evaluation Results
Method             Details                      Date      Accuracy
ProCeedRL          Backbone=Qwen3-8B            2026.04   29.52
DeepResearcher     -                            2026.04   29.3
DAPO/Search-R1     Backbone=Qwen3-8B            2026.04   23.6
RFT                Backbone=Qwen3-8B            2026.04   22.49
Qwen3-8B-v3-SFT    Backbone=Qwen3-8B            2026.04   20.07
Rewinding More     Backbone=Qwen3-8B, Typ...    2026.04   19.11
GiGPO              -                            2026.04   18.9
ReAct Prompting    Backbone=Qwen3-8B            2026.04   18.64