Agentic Coding on SWE-bench Pro
Chart: Pass@1 over time. Top result: DeepSeek-V4 Flash, 52.6 Pass@1. Date shown: May 26, 2026. Updated 6d ago.
Evaluation Results
Method             Details                    Date      Pass@1
DeepSeek-V4 Flash  # Total Params=284B, #...  2026.05   52.6
GPT-5.4 Nano       -                          2026.05   52.4
Qwen3.5            # Total Params=397B, #...  2026.05   50.9
Qwen3.6            # Total Params=35B, #...   2026.05   49.5
LAGUNA M.1         # Total Params=225B, #...  2026.05   49.2
LAGUNA XS.2        # Total Params=33.4B,...   2026.05   46.3
Qwen3.5            # Total Params=35B, #...   2026.05   44.6
Claude Haiku 4.5   -                          2026.05   39.5
Gemma 4            # Total Params=31B, #...   2026.05   35.7