Agentic Search on Frames
[Chart: String-F1 by date (Apr 6, 2026). Top result: Vanilla-Qwen3-32B, 36.6 String-F1]
Updated 11d ago
Evaluation Results
Method                  ASP     Date      String-F1
Vanilla-Qwen3-32B       false   2026.04   36.6
Vanilla-Qwen3-8B        false   2026.04   30.2
Vanilla-Qwen3-4B        false   2026.04   28.4
SFT-Qwen3-1.7B          true    2026.04   25.2
Mixed-Qwen3-1.7B        true    2026.04   25
SFT-Llama-3.2-3B        true    2026.04   24.8
OPD-Qwen3-1.7B          true    2026.04   24.6
SFT-Qwen3-0.6B          true    2026.04   21
SFT-Llama-3.2-1B        true    2026.04   18.6
Distilled-Qwen3-1.7B    false   2026.04   16.7
Distilled-Llama3.2-3B   false   2026.04   16.4
Vanilla-Qwen3-1.7B      false   2026.04   15.2
Distilled-Llama3.2-1B   false   2026.04   11.4
Vanilla-Qwen3-0.6B      false   2026.04   8.9