| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| ToolAlpaca | MTA | Accuracy 97.42 | 20 | 4d ago | |
| API-Bench | MTA | Accuracy 96.3 | 20 | 4d ago | |
| SD Single-domain | MTA | Accuracy 94.04 | 20 | 4d ago | |
| CD Cross-domain | MTA | Accuracy 97.33 | 20 | 4d ago | |
| Chess Skill: beginner, intermediate, advanced | | Accuracy 100 | 10 | 4d ago | |
| Chess Specialists: opening, midgame, endgame, late-endgame | | Accuracy 64.4 | 10 | 4d ago | |
| Trace-based setting | Trace-based | Improvement 6.8 | 4 | 4d ago | |
| All Tasks | ITR | Tools Correct 82 | 4 | 4d ago | |
| Seal-Tools (test) | SARL | Top-1 Acc 99.9 | 2 | 4d ago | |
| GUI-360° (test) | SARL | Top-1 Accuracy 61.9 | 2 | 4d ago | |