Coding on SWE-bench Multilingual
Top score: 77.5 (Claude Opus 4.5), Feb 17, 2026
Updated 4d ago
Evaluation Results
| Method | Framework | Date | Score |
| --- | --- | --- | --- |
| Claude Opus 4.5 | OpenHands | 2026.02 | 77.5 |
| GLM-5 | OpenHands | 2026.02 | 73.3 |
| Kimi K2.5 | OpenHands | 2026.02 | 73 |
| GPT-5.2 (xhigh) | OpenHands | 2026.02 | 72 |
| DeepSeek-V3.2 | OpenHands | 2026.02 | 70.2 |
| GLM-4.7 | OpenHands | 2026.02 | 66.7 |
| Gemini 3 Pro | OpenHands | 2026.02 | 65 |