Long-context understanding on L-Eval 32K
[Chart: P95 Latency (ms). Highlighted point: TTKV, 215, Mar 27, 2026. Other metrics shown: Memory Footprint (GB), Accuracy (%).]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | P95 Latency (ms) | Memory Footprint (GB) | Accuracy (%) |
|---|---|---|---|---|---|
| TTKV | Model=Mistral-7B | 2026.03 | 215 | 0.95 | 59.1 |
| TTKV | Model=Llama-3.1-8B | 2026.03 | 245 | 1.05 | 65 |
| KIVI | Model=Mistral-7B, Quan... | 2026.03 | 325 | 2.75 | 58.8 |
| KVQuant | Model=Mistral-7B, Quan... | 2026.03 | 332 | 2.95 | 59 |
| DiffKV | Model=Mistral-7B | 2026.03 | 338 | 3.05 | 58.8 |
| FP16 | Model=Mistral-7B | 2026.03 | 340 | 3.2 | 59.2 |
| ShadowKV | Model=Mistral-7B | 2026.03 | 345 | 3.15 | 58.9 |
| KIVI | Model=Llama-3.1-8B, Qu... | 2026.03 | 362 | 3.1 | 64.9 |
| KVQuant | Model=Llama-3.1-8B, Qu... | 2026.03 | 370 | 3.35 | 65.2 |
| DiffKV | Model=Llama-3.1-8B | 2026.03 | 378 | 3.45 | 65.1 |
| FP16 | Model=Llama-3.1-8B | 2026.03 | 380 | 3.6 | 65.4 |
| ShadowKV | Model=Llama-3.1-8B | 2026.03 | 385 | 3.55 | 65.3 |
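
The results above are easier to compare when each method is expressed relative to the FP16 baseline for the same model. The following is a minimal sketch of that arithmetic, not part of the benchmark itself: the numbers are copied from the TTKV and FP16 rows above, while the `RESULTS` layout and the `relative_to_baseline` helper are hypothetical names introduced only for illustration.

```python
# Values copied from the TTKV and FP16 rows of the Evaluation Results above.
# Each entry is (p95_latency_ms, memory_footprint_gb, accuracy_pct).
# The dictionary layout and function name are illustrative assumptions.
RESULTS = {
    "Mistral-7B": {
        "TTKV": (215, 0.95, 59.1),
        "FP16": (340, 3.2, 59.2),
    },
    "Llama-3.1-8B": {
        "TTKV": (245, 1.05, 65.0),
        "FP16": (380, 3.6, 65.4),
    },
}


def relative_to_baseline(model, method, baseline="FP16"):
    """Compare one method against a baseline method on the same model."""
    lat, mem, acc = RESULTS[model][method]
    base_lat, base_mem, base_acc = RESULTS[model][baseline]
    return {
        # Percentage by which P95 latency is lower than the baseline.
        "latency_reduction_pct": 100 * (base_lat - lat) / base_lat,
        # How many times smaller the memory footprint is than the baseline.
        "memory_reduction_x": base_mem / mem,
        # Accuracy difference in percentage points (negative means lower).
        "accuracy_delta_pts": acc - base_acc,
    }


if __name__ == "__main__":
    for model in RESULTS:
        summary = relative_to_baseline(model, "TTKV")
        print(model, {k: round(v, 2) for k, v in summary.items()})
```

Other rows from the results can be added to `RESULTS` in the same shape to compare KIVI, KVQuant, DiffKV, or ShadowKV against FP16.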