| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| gsm8k | DDTree | Average Generation Length (τ) | 9.27 | 81 | 22h ago |
| LiveCodeBench | TAPS | Speedup Factor | 7.16 | 66 | 1d ago |
| Spec-Bench | FR-Spec | MT Score | 195.6 | 57 | 6d ago |
| MT-bench | DDTree | Tau (τ) | 6.06 | 53 | 22h ago |
| GPQA Diamond | STAND | Throughput | 91.17 | 48 | 12d ago |
| AIME 2024 | STAND | Throughput (T) | 69.15 | 48 | 12d ago |
| SpecBench | MicroSpec | AVG SR | 900.7 | 47 | 7d ago |
| HumanEval | DDTree | Tau (τ) | 9.65 | 36 | 22h ago |
| SQL | HedgeSpec | MAT | 8.06 | 30 | 1mo ago |
| CNN_DM | LLaMA-3.1-8B-IT | MAT | 1 | 30 | 1mo ago |
| MedQA | HedgeSpec | Match Rate (MAT) | 6.47 | 30 | 1mo ago |
| Chemistry | HedgeSpec | MAT | 7.1 | 30 | 1mo ago |
| Biology | HedgeSpec | MAT | 7.18 | 30 | 1mo ago |
| Math | HedgeSpec | Match Rate | 7.69 | 30 | 1mo ago |
| Python | HedgeSpec | MAT | 7.69 | 30 | 1mo ago |
| MT-Bench, HumanEval, and GSM8K Mean | MTP Lλ LK | Mean Acceptance Length (tau) | 4.83 | 26 | 22h ago |
| AIME 25 | TAPS | Speedup | 7.08 | 26 | 1d ago |
| Avg. | TAPS | Speedup | 6.73 | 24 | 1d ago |
| MBPP | TAPS | Speedup | 6.75 | 24 | 1d ago |
| MATH 500 | TAPS | Speedup | 7.9 | 24 | 1d ago |
| Med | EvoSpec | Throughput (tokens/s) | 128.51 | 22 | 6d ago |
| Law | EvoSpec | Throughput (tokens/s) | 132.69 | 22 | 6d ago |
| Code | EvoSpec | Throughput (tokens/s) | 138.72 | 22 | 6d ago |
| MMSPEC 1.0 (test) | MSD | GQA Speedup | 2.27 | 22 | 2mo ago |
| 20 Prompts across 4 Task Categories | SpecKV-acc. | Mean Expected Tokens per Speculation Step | 6.55 | 20 | 29d ago |