| Dataset Name | SOTA Method | Metric | Trend | Updated |
|---|---|---|---|---|
| Scale AI Multi-Challenge | Qwen3.5-122B-A10B | Pass@1: 61.5 | 7 | 4d ago |
| ArenaHard Hard Prompt v2 | Nemotron-Cascade-2 30B-A3B | Pass@1: 88.2 | 4 | 29d ago |
| ArenaHard Creative Writing v2 | Nemotron-Cascade-2 30B-A3B | Pass@1: 78.7 | 3 | 29d ago |
| ArenaHard Avg. v2 | Nemotron-Cascade-2 30B-A3B | Pass@1: 83.5 | 3 | 29d ago |