Agentic Task on LocalSearch
[Chart: Score over time. Highlighted point: DR-Rubric-8B (Gemini), score 40, May 31, 2026.]
Evaluation Results
Method                  Training       Date     Score
DR-Rubric-8B (Gemini)   SFT+RL, 1K     2026.05  40
Qwen2.5-7B              -              2026.05  37.1
Qwen3-8B                -              2026.05  37.1
DR-Rubric-8B (GPT-5)    SFT+RL, 1K     2026.05  37.1
WebExplorer-8B          SFT+RL, 25K    2026.05  36.8
DR-Tulu-RL-8B           SFT+RL, 25K    2026.05  36.5
DR-Tulu-SFT-8B          SFT, 16K       2026.05  36.4
DR-Rubric-8B (BS-3)     SFT+RL, 3K     2026.05  36.4
Qwen3-8B-SFT            SFT, 1K        2026.05  36
Search-R1-7B            RL, 90K        2026.05  34.2