| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Reasoning | R1-Onevision-Bench (Overall) | Accuracy 39.2 | 23 |
| Multimodal Reasoning | R1-Onevision-Bench Physics | Accuracy 34.9 | 8 |
| Multimodal Reasoning | R1-Onevision-Bench Math | Accuracy 25.7 | 8 |
| Multimodal Reasoning | R1-Onevision-Bench Deduction | Accuracy 27.8 | 8 |