| Dataset Name | SOTA Method | Metric | Value | Trend | Last Updated |
|---|---|---|---|---|---|
| Llama 70B 3.1 (inference) | DeepFusionKernel | Throughput | 1,410.39 | 21 | 3mo ago |
| RTX 6000 Ada platform | LAQuant | Throughput (tokens/sec) | 468.7 | 20 | 22d ago |
| RTX 4090 platform | LAQuant | Throughput (tokens/sec) | 426 | 20 | 22d ago |
| DGX A100 platform | LAQuant | Throughput (tokens/sec) | 359.3 | 20 | 22d ago |
| Multi-task Evaluation Suite Llama-3.2-1B (test) | FR-Spec | MT Throughput (token/s) | 394.81 | 6 | 3mo ago |
| Common Numeracy Benchmarks | NTL-WAS | RMSE ([0, 10^2]) | 0.3 | 5 | 2mo ago |
| workload 64 KiB | base-m-len | Throughput (MiB/s) | 669.1 | 4 | 2mo ago |
| workload 1 KiB | base-m-len | Throughput (MiB/s) | 640.5 | 4 | 2mo ago |
| HQC-256 | NPU-supported | Latency (us/decode) | 116.333 | 2 | 1d ago |
| HQC-192 | NPU-supported | Latency (us/decode) | 56.065 | 2 | 1d ago |
| HQC-128 | NPU-supported | Latency (us/decode) | 39.173 | 2 | 1d ago |