Agentic Search on BrowseComp (String-F1)
[Figure: String-F1 chart. Highlighted result: Vanilla-Qwen3-32B, String-F1 21, Apr 6, 2026.]
Updated 11d ago
Evaluation Results
Method                  Setting     Date      String-F1
----------------------  ----------  --------  ---------
Vanilla-Qwen3-32B       ASP=false   2026.04   21
Vanilla-Qwen3-8B        ASP=false   2026.04   16
Distilled-Qwen3-1.7B    ASP=false   2026.04   10.1
SFT-Qwen3-1.7B          ASP=true    2026.04   10
SFT-Llama-3.2-3B        ASP=true    2026.04   10
SFT-Qwen3-0.6B          ASP=true    2026.04   9
Mixed-Qwen3-1.7B        ASP=true    2026.04   8.8
OPD-Qwen3-1.7B          ASP=true    2026.04   8.4
SFT-Llama-3.2-1B        ASP=true    2026.04   8.2
Vanilla-Qwen3-4B        ASP=false   2026.04   7
Distilled-Llama3.2-1B   ASP=false   2026.04   5.1
Vanilla-Qwen3-0.6B      ASP=false   2026.04   4.6
Distilled-Llama3.2-3B   ASP=false   2026.04   3.5
Vanilla-Qwen3-1.7B      ASP=false   2026.04   3.2