LLM Inference Efficiency on LongBench-E
[Chart: Prefill Time (sec). Data point shown: Exact, 3.032, Feb 11, 2025. Selectable metrics: Prefill Time (sec), Decoding Time (sec).]
Updated 24d ago
Evaluation Results
Method        Configuration              Date     Prefill Time (sec)  Decoding Time (sec)
Exact         -                          2025.02  3.032               37.769
BalanceKV     compression rate=0.25,...  2025.02  3.662               38.054
StreamingLLM  compression rate=0.25      2025.02  3.681               40.276
PyramidKV     compression rate=0.25      2025.02  3.748               37.241
SnapKV        compression rate=0.25      2025.02  3.755               40.426
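
The timings listed above are easier to compare as ratios against the Exact baseline. The following is a minimal sketch, not part of the benchmark itself: the helper names `relative_to_baseline` and `total_time` are hypothetical, and adding prefill and decoding time into one "total" assumes both were measured on the same runs, which the page does not state.

```python
# Timings in seconds, copied from the Evaluation Results listing.
# Each entry maps method -> (prefill_sec, decoding_sec).
TIMINGS = {
    "Exact": (3.032, 37.769),
    "BalanceKV": (3.662, 38.054),
    "StreamingLLM": (3.681, 40.276),
    "PyramidKV": (3.748, 37.241),
    "SnapKV": (3.755, 40.426),
}


def relative_to_baseline(timings, baseline="Exact"):
    """Return (prefill_ratio, decoding_ratio) per method, relative to the baseline."""
    base_prefill, base_decode = timings[baseline]
    return {
        method: (prefill / base_prefill, decode / base_decode)
        for method, (prefill, decode) in timings.items()
    }


def total_time(timings):
    """Return prefill + decoding time per method.

    Assumption: the two columns are additive for a single run.
    """
    return {method: prefill + decode for method, (prefill, decode) in timings.items()}


if __name__ == "__main__":
    ratios = relative_to_baseline(TIMINGS)
    totals = total_time(TIMINGS)
    for method in TIMINGS:
        prefill_ratio, decode_ratio = ratios[method]
        print(
            f"{method:<13} prefill x{prefill_ratio:.3f}  "
            f"decoding x{decode_ratio:.3f}  total {totals[method]:.3f} s"
        )
```

A ratio above 1 means the method is slower than Exact on that metric; a ratio below 1 means it is faster.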