| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| CAUSALWORLD space B | OILCA | Average Return 891.14 | 7 | 3d ago | |
| CAUSALWORLD (in-distribution (Space A to Space A)) | ORIL | Average Return 1,072.05 | 7 | 3d ago | |
| ERQA | InternVL3.5-20B-A4B | Score 41.6 | 4 | 3d ago | |
| AI2D | InternVL3.5-20B-A4B | Score 85.9 | 4 | 3d ago | |
| MathVista | InternVL3.5-20B-A4B | Score 78 | 4 | 3d ago | |
| BLINK | InternVL3.5-20B-A4B | Score 59 | 4 | 3d ago | |
| RealWorldQA | | Score 0.779 | 4 | 3d ago | |
| MMLU-Pro (test) | GEPA | Accuracy 83.76 | 4 | 3d ago | |
| NLVR2 | 1.7B K^8_{0.5} | Score 83.2 | 3 | 3d ago | |
| M3Exam English | 1.7B K^8_{0.5} | Score 58.6 | 3 | 3d ago | |
| VSR | 1.7B K^8_{0.5} | Score 80.6 | 3 | 3d ago | |
| MMLU-Pro (test) | ETGPO | Optimization Token Usage (k) 778 | 3 | 3d ago | |