Agentic Coding on SWE-Bench Verified (Pass@1)
[Chart: Pass@1 (y-axis) by date (x-axis, Mar 21, 2026 to May 26, 2026). Highlighted result: Claude Sonnet 4.6, 79.6 Pass@1.]
Updated 6d ago
Evaluation Results
| Method | Details | Date | Pass@1 |
|---|---|---|---|
| Claude Sonnet 4.6 | # Total Params=-, # Ac... | 2026.05 | 79.6 |
| DeepSeek-V4 Flash | # Total Params=284B, #... | 2026.05 | 79 |
| Claude Sonnet-4.5 | | 2026.03 | 77.2 |
| Gemini 3-pro | | 2026.03 | 76.2 |
| Qwen3.5 | # Total Params=397B, #... | 2026.05 | 76.2 |
| GPT-5 High | | 2026.03 | 74.9 |
| LAGUNA M.1 | # Total Params=225B, #... | 2026.05 | 74.6 |
| GLM-4.7 | # Total Params=355B, #... | 2026.05 | 73.8 |
| Qwen3.6 | # Total Params=35B, #... | 2026.05 | 73.4 |
| Claude Haiku 4.5 | | 2026.05 | 73.3 |
| Seed1.8 | | 2026.03 | 72.9 |
| Devstral 2 | # Total Params=123B, #... | 2026.05 | 72.2 |
| LAGUNA XS.2 | # Total Params=33.4B,... | 2026.05 | 69.9 |
| Qwen3.5 | # Total Params=35B, #... | 2026.05 | 69.2 |
| Devstral Small 2 | # Total Params=24B, #... | 2026.05 | 68 |
| Gemini 2.5-pro | | 2026.03 | 59.6 |
| Gemma 4 | # Total Params=31B, #... | 2026.05 | 52 |