Computer Science Problem Solving on FrontierCS
[Chart: Avg@5 and Best@5 over time. Top entry: FrontierSmith (200), Avg@5 19.82, May 14, 2026.]
Evaluation Results
Method               Model        Date     Avg@5  Best@5
FrontierSmith (200)  Qwen3.5-27B  2026.05  19.82  29.38
FrontierCS (172)     Qwen3.5-27B  2026.05  13.98  21.92
HardTests (200)      Qwen3.5-27B  2026.05  11.2   18.37
FrontierCS (172)     Qwen3.5-9B   2026.05  11.17  16.29
ALE-bench (40)       Qwen3.5-9B   2026.05  10.64  15.7
FrontierSmith (200)  Qwen3.5-9B   2026.05  10.62  15.73
Base                 Qwen3.5-27B  2026.05  7.7    13.91
HardTests (200)      Qwen3.5-9B   2026.05  5.38   11.19
Random Reward (172)  Qwen3.5-9B   2026.05  3.04   6.93
Base                 Qwen3.5-9B   2026.05  1.8    5
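
The page does not define Avg@5 or Best@5. One common reading, assumed here and not confirmed by the source, is that each problem is attempted 5 times: Avg@5 averages the per-problem mean score, and Best@5 averages the per-problem maximum. The sketch below illustrates that reading on made-up scores; the function names and `toy_scores` are hypothetical and unrelated to the leaderboard's actual data or evaluation code.

```python
from statistics import mean


def avg_at_k(scores_per_problem):
    """Average, over problems, of the mean score across that problem's attempts.

    Assumed definition of Avg@k; the leaderboard page does not state one.
    """
    return mean(mean(attempts) for attempts in scores_per_problem)


def best_at_k(scores_per_problem):
    """Average, over problems, of the best score across that problem's attempts.

    Assumed definition of Best@k; the leaderboard page does not state one.
    """
    return mean(max(attempts) for attempts in scores_per_problem)


# Hypothetical scores: 2 problems, 5 attempts each (not real benchmark data).
toy_scores = [
    [10.0, 20.0, 30.0, 40.0, 50.0],
    [0.0, 0.0, 10.0, 0.0, 0.0],
]

print(f"Avg@5  = {avg_at_k(toy_scores):.2f}")
print(f"Best@5 = {best_at_k(toy_scores):.2f}")
```

Under this reading Best@5 can never be lower than Avg@5 for the same run, which matches every row in the table above.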