| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| Real-world perception-centric reasoning suite (test) | GLM-9B-DeltaThinker | Average Score | 55.53 | 8 | 16d ago |
| V* (test) | GLM-9B-DeltaThinker | Accuracy | 84.25 | 8 | 16d ago |
| HRBench 4K (test) | GLM-9B-DeltaThinker | Accuracy | 80.25 | 8 | 16d ago |
| RealWorldQA (test) | GLM-9B-DeltaThinker | Accuracy | 77.04 | 8 | 16d ago |