Software Engineering on SWE-rebench January 2026 (test)
Best Resolved Rate: 52.9 (Claude Opus 4.6), Feb 17, 2026
Metrics: Resolved Rate, Resolved Rate SEM, Pass@5
Updated 4d ago
Evaluation Results

Method                    Date      Resolved Rate   Resolved Rate SEM   Pass@5
Claude Opus 4.6           2026.02   52.9            1.06                70.8
GPT-5.2 (config=xhigh)    2026.02   51.7            1.21                58.3
Claude Sonnet 4.5         2026.02   47.1            1.69                60.4
Gemini 3 Pro              2026.02   46.7            2.04                58.3
Claude Opus 4.5           2026.02   43.8            0.93                58.3
GLM-5                     2026.02   42.1            1.21                50
GLM-4.7                   2026.02   41.3            2.12                56.3
Kimi K2.5                 2026.02   37.9            1.21                50
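
One way to read the Resolved Rate SEM column is to turn each score into an approximate interval and check whether neighbouring entries overlap. The sketch below assumes a normal approximation (rate ± 1.96 × SEM for roughly 95% coverage). The page does not state how the SEM should be interpreted, so this is an illustration only, and the helper names `interval` and `overlaps` are hypothetical.

```python
# Rows copied from the results above: (method, resolved rate, resolved rate SEM).
ROWS = [
    ("Claude Opus 4.6", 52.9, 1.06),
    ("GPT-5.2 (config=xhigh)", 51.7, 1.21),
    ("Claude Sonnet 4.5", 47.1, 1.69),
    ("Gemini 3 Pro", 46.7, 2.04),
    ("Claude Opus 4.5", 43.8, 0.93),
    ("GLM-5", 42.1, 1.21),
    ("GLM-4.7", 41.3, 2.12),
    ("Kimi K2.5", 37.9, 1.21),
]

# Assumption: normal approximation, so about 95% coverage at z = 1.96.
Z_95 = 1.96


def interval(rate, sem, z=Z_95):
    """Return (low, high) for rate +/- z * sem, rounded to two decimals."""
    half_width = z * sem
    return (round(rate - half_width, 2), round(rate + half_width, 2))


def overlaps(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    previous = None
    for name, rate, sem in ROWS:
        current = interval(rate, sem)
        note = ""
        if previous is not None and overlaps(previous, current):
            note = "  (overlaps the entry above)"
        print(f"{name:<24} {rate:5.1f}  [{current[0]:.2f}, {current[1]:.2f}]{note}")
        previous = current
```

Under this assumption the intervals for the top two entries overlap, so the 1.2-point gap between them should be read with caution. Overlap is only a rough heuristic, not a significance test.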