| Dataset Name | SOTA Method | Metric | Score | Trend | Updated |
|---|---|---|---|---|---|
| GPQA Diamond | | Accuracy | 84.4 | 64 | 3d ago |
| ScienceQA image | S-MFT | Accuracy | 80.4 | 53 | 2d ago |
| GPQA-D | IOA | Accuracy (GPQA-D) | 14.43 | 20 | 2d ago |
| SciRAG-SSLI hard 1.0 (test) | | F1 Score | 46.86 | 19 | 3d ago |
| SciRAG-SSLI easy 1.0 (test) | RankGPT | F1 Score | 46.55 | 19 | 3d ago |
| GPQA | CISPO | pass@1 | 18.2 | 18 | 3d ago |
| GPQA Diamond (test) | Transformers | Pass@1 | 49 | 16 | 3d ago |
| SciQA | Qwen3-VL-4B-Instruct | Accuracy | 92.7 | 13 | 2d ago |
| Science & QA Domain Out-of-Domain | | SampleQA Score | 3.19 | 11 | 3d ago |
| GPQA | AIM | Accuracy | 70.7 | 11 | 3d ago |
| ScienceQA I | CVLM (3M IKPairs) w/o FKA | Accuracy | 69.96 | 8 | 3d ago |
| ScienceQA | LLaVA-v1.6 (7B) w/ STIC | Accuracy | 75.3 | 7 | 2d ago |
| Scientific Disciplines In-Domain | FT (tuned) | Chemistry Accuracy | 64.9 | 6 | 3d ago |
| GPQA Diamond | Gemma-3-27b-it | pass@50 | 92.42 | 2 | 3d ago |
| GPQA Diamond (VeRA-E) | | Avg@5 Accuracy (Seeds) | 79.27 | 1 | 3d ago |