| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| SWE-Bench Lite | AIR | Resolution Rate | 73.5 | 16 | 4d ago |
| Terminal Bench | Hybrid | Resolve Rate | 32.5 | 4 | 4d ago |
| Multi-SWE-Bench Mini | Hybrid | Resolution Rate | 20 | 4 | 4d ago |
| SWE-Bench Multilingual | Hybrid | Resolve Rate | 0.357 | 4 | 4d ago |
| SWE-Bench Live Lite | Hybrid | Resolve Rate | 0.224 | 4 | 4d ago |
| SWE-bench 500 (test) | agyn | Resolution Score | 72.2 | 4 | 4d ago |