Single-Hop QA on NQ
[Chart: Accuracy by date (axis label: Mar 18, 2026). Top result: SLEA-RL, 48.5 Accuracy.]
Updated 29d ago
Evaluation Results

Method        Base Model        Date      Accuracy
SLEA-RL       Qwen2.5-7B-...    2026.03   48.5
IGPO          Qwen2.5-7B-...    2026.03   46.7
GiGPO         Qwen2.5-7B-...    2026.03   46.4
SkillRL       Qwen2.5-7B-...    2026.03   45.9
ZeroSearch    Qwen2.5-7B-...    2026.03   43.6
EvolveR       Qwen2.5-7B-...    2026.03   43.5
GSPO          Qwen2.5-7B-...    2026.03   41.5
RLOO          Qwen2.5-7B-...    2026.03   40.7
GRPO          Qwen2.5-7B-...    2026.03   40.3
Search-R1     Qwen2.5-7B-...    2026.03   39.3
PPO           Qwen2.5-7B-...    2026.03   38.7
Reinforce++   Qwen2.5-7B-...    2026.03   34.3
RAG           Qwen2.5-7B-...    2026.03   27.4
R1-Instruct   Qwen2.5-7B-...    2026.03   21.0
Search-o1     Qwen2.5-7B-...    2026.03   19.4
CoT           Qwen2.5-7B-...    2026.03   12.8
Qwen2.5       Qwen2.5-7B-...    2026.03   11.6