Issue Resolution on SWE-Bench Sphinx Verified (test)
[Chart: Accuracy over time. Top result: GLM-4.5-Air, 43.51 (Jan 28, 2026)]
Updated 4d ago
Evaluation Results
Method                 Details                      Date      Accuracy
GLM-4.5-Air            Context window=32K, Ro...    2026.01   43.51
Devstral-Small-2-24B   Context window=32K           2026.01   38.95
SERA-32B-Sphinx        Context window=32K, Ba...    2026.01   37.14