LLM Inference Efficiency on Synthetic LLM Workload (4K Input/4K Output)
[Chart: Latency (s) and Throughput (T/S) by date. Plotted point: ChunkKV, 160.32 s latency, Feb 4, 2025.]
Updated 23d ago
Evaluation Results
Method    Setting                     Date      Latency (s)   Throughput (T/S)
-------   -------------------------   -------   -----------   ----------------
ChunkKV   Input Sequence Length=...   2025.02   160.32        41.3
ShotKV    Input Sequence Length=...   2025.02   162.85        41.12
SnapKV    Input Sequence Length=...   2025.02   163.45        40.51
FullKV    Input Sequence Length=...   2025.02   175.5         37.73
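
To make the gaps between rows easier to read, the figures above can be expressed relative to one row. The sketch below is illustrative only: it assumes FullKV serves as the baseline (the page itself does not say so), and the function name `relative_to_baseline` is hypothetical. The numbers are copied from the results listed above.

```python
# Figures copied from the Evaluation Results listing:
# method -> (latency in seconds, throughput in T/S).
results = {
    "ChunkKV": (160.32, 41.3),
    "ShotKV": (162.85, 41.12),
    "SnapKV": (163.45, 40.51),
    "FullKV": (175.5, 37.73),
}


def relative_to_baseline(results, baseline="FullKV"):
    """Return percent latency reduction and throughput gain versus a baseline row.

    Assumption: FullKV is treated as the baseline; the page does not state this.
    """
    base_latency, base_throughput = results[baseline]
    relative = {}
    for method, (latency, throughput) in results.items():
        latency_reduction = (base_latency - latency) / base_latency * 100
        throughput_gain = (throughput - base_throughput) / base_throughput * 100
        relative[method] = (latency_reduction, throughput_gain)
    return relative


if __name__ == "__main__":
    for method, (lat, thr) in relative_to_baseline(results).items():
        print(f"{method}: latency reduction {lat:.2f}%, throughput gain {thr:.2f}%")
```

Passing a different `baseline` key compares every row against that method instead.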