| Dataset Name | SOTA Method | Metric | Trend | | |
|---|---|---|---|---|---|
| Aggregate Mean over Alpaca, CodeAlpaca, HumanEval, LiveCodeBench, Math500, MBPP, MT-Bench | DART | Mean Speedup 2.87 | 21 | 4d ago | |
| MBPP | DART | Speedup 3.09 | 21 | 4d ago | |
| Math500 | DART | Speedup 2.84 | 21 | 4d ago | |
| LiveCodeBench | DART | Speedup 2.81 | 21 | 4d ago | |
| Alpaca | DART | Speedup 2.95 | 21 | 4d ago | |
| Transformers (PyTorch) workflow Qwen3 family (inference) | CryptoTensors | Model Load Time (s) 1.16 | 18 | 4d ago | |
| ToolBench | AugServe | Goodput (req/s) 3.9 | 18 | 4d ago | |
| Merge | AugServe | Goodput (req/s) 1.16 | 18 | 4d ago | |
| ToolBench dataset | AugServe | SLO Attainment 100 | 9 | 4d ago | |
| Merge dataset | AugServe | SLO Attainment 54.3 | 9 | 4d ago | |
| Long-Context LLM Inference Decode | | Latency (ms) 0.13 | 8 | 4d ago | |
| Alpaca, CodeAlpaca, HumanEval, LiveCodeBench, Math500, MBPP, and MT-Bench | DART | Speedup (Alpaca) 2.61 | 8 | 4d ago | |
| Long-Context LLM Inference (Prefill) | Kascade | Prefill Latency (ms) 0.62 | 6 | 4d ago | |
| LLaMA-2 70B sequence length 2048 | CXL-SpecKV + Comp | Max Batch Size 384 | 5 | 4d ago | |