LLM Generation Efficiency on Qwen 2.5 14B (2048 input + 8192 generation tokens)
[Chart: End-to-end Latency (s). Lowest result: GRIFFIN, 287.7 s, May 8, 2025. Alternate metric view: Avg. Time to Next Token (ms).]
Updated 4d ago
Evaluation Results
Method    Configuration                Date      End-to-end Latency (s)   Avg. Time to Next Token (ms)
GRIFFIN   Sparsity=50%, r=256, H...    2025.05   287.7                    35
Caprese   Sparsity=50%, r=256, H...    2025.05   288.8                    35
LoRA      Sparsity=50%, r=256, H...    2025.05   312.5                    38
Full      Hardware=NVIDIA L40 GP...    2025.05   344.9                    42
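
To make the gaps between methods easier to read, the figures above can be turned into speedups relative to the Full baseline. The sketch below is illustrative only: it copies the latency and time-to-next-token values from the results, takes the 8192 generated tokens from the benchmark title, and uses hypothetical helper names (`speedup_over`, `decode_time_estimate`) that do not come from the benchmark itself. The decode estimate assumes that generation time is roughly the token count multiplied by the average time to next token, which is a simplification.

```python
# Values copied from the results above; nothing here is re-measured.
GENERATED_TOKENS = 8192  # generation length stated in the benchmark title

# method -> (end-to-end latency in seconds, avg. time to next token in ms)
results = {
    "GRIFFIN": (287.7, 35),
    "Caprese": (288.8, 35),
    "LoRA": (312.5, 38),
    "Full": (344.9, 42),
}


def speedup_over(baseline: str, method: str) -> float:
    """Ratio of baseline latency to method latency (values > 1 mean faster)."""
    return results[baseline][0] / results[method][0]


def decode_time_estimate(method: str) -> float:
    """Rough decode time in seconds: tokens * avg. time to next token.

    Assumption: every generated token costs the reported average.
    """
    return GENERATED_TOKENS * results[method][1] / 1000.0


if __name__ == "__main__":
    for name, (latency_s, _) in results.items():
        print(
            f"{name:8s} latency={latency_s:6.1f}s "
            f"speedup_vs_Full={speedup_over('Full', name):.3f} "
            f"decode_estimate={decode_time_estimate(name):.1f}s"
        )
```

Under these assumptions the decode estimate lands within about 1 to 1.2 s of each reported end-to-end latency, which suggests the two reported metrics are mutually consistent and that generation dominates the total time.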