Agent on SWE-bench Verified
[Chart: Accuracy over time. Top result: DeepSeek V3.2 at 72.1 (date shown on chart: Dec 30, 2025).]
Evaluation Results
Method                   Evaluation Mode   Date      Accuracy
DeepSeek V3.2            Chat              2025.12   72.1
GLM 4.6                  Chat              2025.12   68
LongCat-Flash Exp-Chat   Chat              2025.12   63.2
LongCat-Flash Chat       Chat              2025.12   60.4