| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Graph Coloring (test) | HyperGuide | Accuracy | 88 | 21 | 8d ago |
| N-Queens N=8 (test) | HyperGuide | Accuracy | 27.2 | 21 | 8d ago |
| Gears World | | Run Time (s) | 0.02 | 17 | 22d ago |
| Sudoku | DiffThinker | CSP Result Index | 3557 | 12 | 3mo ago |
| SudokuBench | Gemini 2.5 Pro+ASP | Accuracy | 74.7 | 10 | 1mo ago |
| ZL-XXL | Gemini 2.5 Pro+ASP | Accuracy (%) | 97.7 | 10 | 1mo ago |
| ZL-XL | DS-R1-0528+ASP | Accuracy | 97.3 | 10 | 1mo ago |
| Sudoku 4x4 | ActFocus | Success Rate | 95.7 | 4 | 19d ago |
| Educational Tutoring Environment | | Progress Score (C2) | 53.8 | 4 | 1mo ago |
| Short-generation task | ICFA | Accuracy | 74.3 | 4 | 3mo ago |