| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| LLaMA-2 7B (inference) | JIT+CUDA | P99 Per-Token Latency (ms) | 8.23 | 33 | 1mo ago |
| Code Assistant | DReSD | TPS | 104 | 20 | 3mo ago |
| Decode Phase BS=1 | BWTA (Bitnet-b1.58-2B) | Latency (s) | 0.152 | 18 | 1mo ago |
| FinanceQA | SpecBundle | Throughput | 1,779 | 18 | 2mo ago |
| GPQA | SpecBundle | Throughput | 2,341.3 | 18 | 2mo ago |
| Prefill Phase SeqLen=2k | BWTA (Bitnet-b1.58-2B) | Prefill Time (s) | 0.025 | 15 | 1mo ago |
| Held-out datasets chatbot_instruction_prompts and finance-alpaca (test) | Aurora (Qwen3-Coder-Next (FP8)) | Throughput (TPS) | 265.7 | 14 | 3mo ago |
| Qwen2.5-7B (test) | TAQ-IS | Throughput | 37.29 | 7 | 15d ago |
| 8K context (test) | | Q Score | 81.35 | 6 | 14d ago |
| Qwen3-0.6B (inference) | EDGERAZOR | Storage (GB) | 0.255 | 6 | 27d ago |
| Llama 3.2 1B | | TPOTH | 1.94 | 4 | 2mo ago |
| Synthetic heavy-tail workload Pareto distribution | BatchLLM | Throughput (req/s) | 17.02 | 2 | 1mo ago |
| Llama 3.1 | RoME | Latency (ms) | 48.1 | 2 | 1mo ago |