| Dataset Name | SOTA Method | Metric | Trend | ||
|---|---|---|---|---|---|
| RTX 3090 24GB (inference) | Layer-Condensed KV Cache | Max Batch Size: 1,150 | 24 | 1mo ago | |
| Llama-2-7B | Protocol 2 (SIGMA) | Latency (LAN): 22.1 | 12 | 1mo ago | |
| A100 80GB (inference) | Layer-Condensed KV Cache | Maximum Batch Size: 128 | 6 | 1mo ago | |
| Synthetic | H2O (20%) | Latency (s): 50.4 | 6 | 1mo ago | |
| SpecBench | Eagle2 | Tokens/s: 116.95 | 3 | 1mo ago | |