LLM Generation Efficiency on Qwen 2.5 14B (2048 input + 2048 generation tokens)
[Chart: End-to-end Latency (s). Best result: 70.3 (GRIFFIN), May 8, 2025. Alternate metric: Avg. Time to Next Token (ms).]
Evaluation Results
Method    Method (details)             Date      End-to-end Latency (s)   Avg. Time to Next Token (ms)
-------   --------------------------   -------   ----------------------   ----------------------------
GRIFFIN   Sparsity=50%, r=256, H...    2025.05   70.3                     34
Caprese   Sparsity=50%, r=256, H...    2025.05   70.4                     34
LoRA      Sparsity=50%, r=256, H...    2025.05   76.5                     37
Full      Hardware=NVIDIA L40 GP...    2025.05   84.4                     41
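
To make the gaps between rows easier to read, the sketch below computes each method's speedup and latency reduction relative to the "Full" row. Treating "Full" as the baseline is an assumption made here for illustration; the page itself does not label a baseline. The helper names `speedup_over` and `reduction_pct` are hypothetical, and the numbers are copied from the results listed above.

```python
# Figures copied from the Evaluation Results listing.
results = {
    "GRIFFIN": {"latency_s": 70.3, "next_token_ms": 34},
    "Caprese": {"latency_s": 70.4, "next_token_ms": 34},
    "LoRA": {"latency_s": 76.5, "next_token_ms": 37},
    "Full": {"latency_s": 84.4, "next_token_ms": 41},
}


def speedup_over(baseline, method, metric="latency_s"):
    """Ratio baseline/method for a lower-is-better metric (>1 means faster)."""
    return results[baseline][metric] / results[method][metric]


def reduction_pct(baseline, method, metric="latency_s"):
    """Percentage by which `method` lowers `metric` relative to `baseline`."""
    base = results[baseline][metric]
    return 100.0 * (base - results[method][metric]) / base


# Assumption: "Full" is used as the reference row.
for name in results:
    if name == "Full":
        continue
    print(
        f"{name}: {speedup_over('Full', name):.2f}x faster, "
        f"{reduction_pct('Full', name):.1f}% lower end-to-end latency"
    )
```

Passing `metric="next_token_ms"` applies the same comparison to the Avg. Time to Next Token column.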