Software Engineering on SWE-bench
Top result: Qwen3.5-122B-A10B, Resolve Rate 67.4
[Chart: Resolve Rate over time, Mar 22, 2026 to Apr 14, 2026]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Resolve Rate |
|---|---|---|---|
| Qwen3.5-122B-A10B | Framework/Scaffold=Ope... | 2026.04 | 67.4 |
| Qwen3.5-122B-A10B | Framework/Scaffold=Ope... | 2026.04 | 66.4 |
| Qwen3.5-122B-A10B | Framework/Scaffold=Codex | 2026.04 | 61.2 |
| Nemotron 3 Super | Framework/Scaffold=Ope... | 2026.04 | 60.47 |
| Nemotron 3 Super | Framework/Scaffold=Ope... | 2026.04 | 59.2 |
| Devstral Small 2 | Model size=24B, Model... | 2026.04 | 56.4 |
| Devin | Context=Best Published... | 2026.03 | 55 |
| Nemotron 3 Super | Framework/Scaffold=Codex | 2026.04 | 53.73 |
| GPT-OSS-120B | Framework/Scaffold=Ope... | 2026.04 | 41.9 |
| gpt-oss-20b | Model size=20B, Model... | 2026.04 | 26 |
| ARYA | Parameters=0 | 2026.03 | 0 |