LLM Generation Efficiency on Qwen 2.5 14B (2048 input + 256 generation tokens)
[Chart: End-to-end Latency (s) by date. Labeled point: GRIFFIN, 8.7, May 8, 2025. Alternate metric view: Avg. Time to Next Token (ms).]
Evaluation Results
Method    Details                      Date      End-to-end Latency (s)   Avg. Time to Next Token (ms)
GRIFFIN   Sparsity=50%, r=256, H...    2025.05   8.7                      34
Caprese   Sparsity=50%, r=256, H...    2025.05   8.7                      34
LoRA      Sparsity=50%, r=256, H...    2025.05   9.5                      37
Full      Hardware=NVIDIA L40 GP...    2025.05   10.5                     41
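The reported figures can be compared against the Full baseline with a few lines of arithmetic. The sketch below is illustrative only: the dictionary layout and the helper names `speedup_over` and `decode_time_s` are hypothetical, and the values are copied from the results listed above. It also checks one assumption, namely that end-to-end latency is roughly the 256 generated tokens multiplied by the average time to next token. The page does not state how the two metrics relate, so treat that as a plausibility check, not a definition.

```python
# Values copied from the Evaluation Results listed above.
RESULTS = {
    "GRIFFIN": {"latency_s": 8.7, "next_token_ms": 34},
    "Caprese": {"latency_s": 8.7, "next_token_ms": 34},
    "LoRA": {"latency_s": 9.5, "next_token_ms": 37},
    "Full": {"latency_s": 10.5, "next_token_ms": 41},
}

GENERATED_TOKENS = 256  # from the benchmark title (2048 input + 256 generation tokens)


def speedup_over(baseline: str, method: str) -> float:
    """Ratio of the baseline's end-to-end latency to the method's latency."""
    return RESULTS[baseline]["latency_s"] / RESULTS[method]["latency_s"]


def decode_time_s(method: str) -> float:
    """Assumption (not stated by the source): generated tokens * avg time per token."""
    return GENERATED_TOKENS * RESULTS[method]["next_token_ms"] / 1000.0


if __name__ == "__main__":
    for name, row in RESULTS.items():
        print(
            f"{name:8s} speedup vs Full: {speedup_over('Full', name):.2f}x  "
            f"estimated decode time: {decode_time_s(name):.2f} s  "
            f"reported latency: {row['latency_s']} s"
        )
```

Under that assumption the estimates land close to the reported latencies for all four methods, which suggests the two columns are consistent with each other.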