| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Visual Question Answering | M3CoT | Accuracy | 61.2 | 56 |
| Multimodal Reasoning | M3CoT (test) | Total Acc | 91.61 | 47 |
| Multimodal Chain-of-Thought Reasoning | M3CoT | Accuracy | 50.5 | 42 |
| General Visual Reasoning | M3CoT | Accuracy | 74.2 | 17 |
| Multi-modal Reasoning | M3CoT | Accuracy | 66.94 | 12 |
| General multimodal reasoning | M3CoT | Pass@1 Accuracy | 78.21 | 11 |
| Visual Evidence Quality Evaluation | M3CoT Reasoning (subset of 500 samples) | AIM-CoT Win Rate | 76.4 | 2 |