| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Faithful Reasoning | SPD-Faith Bench Multi-Difference 1.0 (test) | CR | 87.7 | 12 |
| Faithful Perception | SPD-Faith Bench Multi-Difference Subset 1.0 (test) | Precision (Type: Color) | 97.5 | 12 |
| Global Perception | SPD-Faith Bench Multi-Difference 1.0 (test) | DQR Score | 68.3 | 12 |
| Multimodal Reasoning | SPD-Faith Bench Hard 1.0 | Contradiction Rate | 10.9 | 12 |
| Multimodal Reasoning | SPD-Faith Bench Medium 1.0 | Contradiction Rate | 11.3 | 12 |
| Multimodal Reasoning | SPD-Faith Bench Easy 1.0 | Contradiction Rate | 5 | 12 |