Agentic Search on LongSeAL
[Chart: String-F1 over time. Highlighted point: Vanilla-Qwen3-32B, 13.5 String-F1, Apr 6, 2026.]
Updated 11d ago
Evaluation Results
Method                   ASP         Date      String-F1
Vanilla-Qwen3-32B        ASP=false   2026.04   13.5
Vanilla-Qwen3-8B         ASP=false   2026.04   10.8
SFT-Qwen3-1.7B           ASP=true    2026.04   10.1
SFT-Llama-3.2-3B         ASP=true    2026.04   10.1
Distilled-Llama3.2-3B    ASP=false   2026.04   9.8
SFT-Llama-3.2-1B         ASP=true    2026.04   9.3
Vanilla-Qwen3-1.7B       ASP=false   2026.04   9
OPD-Qwen3-1.7B           ASP=true    2026.04   8.6
Mixed-Qwen3-1.7B         ASP=true    2026.04   8.3
Vanilla-Qwen3-4B         ASP=false   2026.04   8.2
Vanilla-Qwen3-0.6B       ASP=false   2026.04   7.4
SFT-Qwen3-0.6B           ASP=true    2026.04   7
Distilled-Qwen3-1.7B     ASP=false   2026.04   6
Distilled-Llama3.2-1B    ASP=false   2026.04   6