| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| SQuAD | QZO | F1 Score | 88.3 | 52 | 21d ago |
| DROP | SubZero-GV (Prefix) | F1 Score | 32.9 | 43 | 1mo ago |
| Coding domain | L2P | Average Wall-Clock Time (s) | 4.14 | 40 | 8d ago |
| QA domain | L2P | Average Generation Time (s) | 2.47 | 40 | 8d ago |
| Math domain | L2P | Average Generation Time (s) | 2.86 | 40 | 8d ago |
| MLLMU-Bench (Forget Set) | | Rouge Score | 64.5 | 37 | 1d ago |
| DVisRAG (Real) | RobustVisRAG | Accuracy | 69.03 | 32 | 2mo ago |
| DVisRAG Synthetic | RobustVisRAG | Accuracy | 57.87 | 32 | 2mo ago |
| VisRAG Clean | RobustVisRAG | Accuracy | 67.33 | 32 | 2mo ago |
| ViDoRe AI v1 (test) | MO-PGD | ASR-G Similarity (mean) @-11 | | 23 | 2mo ago |
| Big-Bench Hard (test) | FLAN-PaLM 540B | Exact Match | 57.9 | 17 | 3mo ago |
| UniICL-Bench | UniICL | Perception | 86.5 | 15 | 2mo ago |
| VPTT-Bench 1.0 (test) | Comb. (BRAG + VPRAG) | VPTT Score (Novelty Adjusted) | 0.644 | 15 | 3mo ago |
| Arabic Creative Writing Evaluation Dataset (test) | LLaMA-3-8B | BERT-Score | 81.6 | 12 | 1mo ago |
| Charge | REPA-P | Data Metric Value | 0.0081 | 10 | 12d ago |
| Darcy Flow | REPA-P | Data Fidelity Deviation | 1.19 | 10 | 12d ago |
| ViDoRe AI Targeted Setting III V1 | MO-PGD | ASR-G Sim^-1 (Mean) | 93 | 10 | 2mo ago |
| CIFAR 16x16 | NegBio-VAE | FID (5k) | 40.2788 | 8 | 1mo ago |
| MNIST | NegBio-VAE | FID@5k | 79.6727 | 8 | 1mo ago |
| Torus RNA (test) | RMF-MT | MMD | 0.07 | 7 | 2mo ago |
| Torus Protein PrePro (test) | G-LSD | MMD | 0.05 | 7 | 2mo ago |
| Torus Protein Proline (test) | G-LSD | MMD | 0.04 | 7 | 2mo ago |
| Torus Protein Glycine (test) | G-LSD | MMD | 0.03 | 7 | 2mo ago |
| Torus Protein General (test) | RCT | MMD | 0.01 | 7 | 2mo ago |
| OPT 1.3B Generation | | SQuAD Score | 80.4 | 6 | 1mo ago |