Software Engineering on Commit0-Lite
[Chart: Score (other views: Runtime, Cost). Highlighted point: Claude Sonnet 4.5, score 59.5, Mar 23, 2026.]
Updated 25d ago
Evaluation Results
| Method | Score | Runtime | Cost |
|---|---|---|---|
| Claude Sonnet 4.5; SDK=v1.11.0, Agent Con...; 2026.03 | 59.5 | 2,275.8 | 10.1 |
| Claude Sonnet 4.5; SDK=v1.11.0, Agent Con...; 2026.03 | 59.1 | 1,583.2 | 8.1 |
| MiniMax 2.5; SDK=v1.11.0, Agent Con...; 2026.03 | 57 | 1,908.6 | 4.5 |
| MiniMax 2.5; SDK=v1.11.0, Agent Con...; 2026.03 | 57 | 2,660.7 | 6.1 |
| Claude Sonnet 4.5; SDK=v1.11.0, Agent Con...; 2026.03 | 53.1 | 692.6 | 1.9 |
| GLM 4.7; SDK=v1.11.0, Agent Con...; 2026.03 | 46.5 | 1,387.8 | 7.3 |
| GLM 4.7; SDK=v1.11.0, Agent Con...; 2026.03 | 46.5 | 2,257.8 | 9.8 |
| GLM 4.7; SDK=v1.11.0, Agent Con...; 2026.03 | 42.8 | 871 | 2.5 |
| MiniMax 2.5; SDK=v1.11.0, Agent Con...; 2026.03 | 42.3 | 752.1 | 1.6 |