| Dataset Name | SOTA Method | Metric | Trend | |
|---|---|---|---|---|
| TESTEVAL | | Syntax Correctness: 95.64 | 64 | 1mo ago |
| UnLeakedTestBench | Code-A1 | Pass@1: 36.79 | 12 | 1mo ago |
| TDD-Bench Verified (test) | SWE-SPOT | Pass Rate: 22.75 | 9 | 1mo ago |
| CAV-Gym n=5 (test) | Q-Learning | Accuracy (%): 40.2 | 6 | 1mo ago |
| CAV-Gym n=1 (test) | Q-Learning | Accuracy (%): 16 | 6 | 1mo ago |