| Dataset Name | SOTA Method | Metric | Score | Trend | Last Updated |
|---|---|---|---|---|---|
| ScienceQA IMG | EMOVA | Accuracy | 98.2 | 256 | 2d ago |
| ARC Challenge | Qwen-3-30B-A3B | Accuracy | 96 | 234 | 2d ago |
| ScienceQA | LongVILA-7B (S3) | Accuracy | 98.5 | 229 | 2d ago |
| ScienceQA (test) | PVC InternVL2 (Ours) | Average Accuracy | 97.7 | 208 | 2d ago |
| ARC-E | Qwen3-4B | Accuracy | 97.53 | 138 | 3d ago |
| ScienceQA (SQA) | Qwen2.5-VL | Accuracy | 88.8 | 128 | 2d ago |
| ARC-C | | Accuracy | 96.3 | 127 | 3d ago |
| ScienceQA SQA-IMG | TroL | Accuracy | 92.8 | 114 | 2d ago |
| ARC Easy | | Accuracy | 96.95 | 101 | 2d ago |
| GPQA | M2CL | pass@1 Accuracy | 87.6 | 85 | 3d ago |
| ScienceQA SQA-I | InternVL2-8B + RP | Accuracy | 96.6 | 81 | 2d ago |
| SciQA-IMG | Phi 3.5 Vision | SciQA-IMG Accuracy | 89 | 53 | 3d ago |
| ScienceQA | Qwen2.5-VL-7B-Instruct | IMG Score | 88.6 | 49 | 3d ago |
| ScienceQA IMG (test) | GPT-4V-1106 | Accuracy | 82.1 | 45 | 3d ago |
| SciQ | Llama 3-8B E8T2 | Normalized Accuracy | 97 | 44 | 3d ago |
| ScienceQA Image | FastV | Score | 74.2 | 38 | 2d ago |
| GPQA | RLTR | Accuracy | 34.8 | 28 | 3d ago |
| GPQA D | CGRS | Accuracy | 65.2 | 28 | 3d ago |
| ScienceQA v1.0 (test) | AUTOACT | Accuracy (G1-4) | 82.5 | 26 | 3d ago |
| ARC Easy | Qwen-1.5 14B | Accuracy | 87.58 | 26 | 3d ago |
| ARC-c (test) | Trinity Large (MoE) | Accuracy | 90 | 25 | 3d ago |
| GPQA | AC | Memory Ratio | 0.21 | 24 | 3d ago |
| GPQA (test) | ExpRAG | Accuracy | 65.7 | 24 | 3d ago |
| ScienceQA | | Exact Match (EM) | 88.45 | 23 | 3d ago |
| SQA IMG | TopV | Score | 97.67 | 23 | 3d ago |