Language Model Inference on stories100m (110M parameters)
Top result: 298.7 tokens/s, PyTorch (Accelerate), Jan 6, 2026
[Chart: Tokens/s, Latency (ms)]
Evaluation Results
Method                   Optimization    Date     Tokens/s  Latency (ms)
PyTorch (Accelerate)     AMX Copro...    2026.01  298.7     3.3
bare_metal::Transformer  NEON Manual     2026.01  61.3      16.3
Scalar C++               -O3 Auto-...    2026.01  24        41.6
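The reported figures can be cross-checked with a short sketch. It assumes the latency column is per-token latency in milliseconds, so that latency is roughly 1000 / tokens/s; the page does not state this, but the three reported pairs agree with it to within rounding. The function names are hypothetical and the numbers are copied from the results above.

```python
# (tokens/s, reported latency in ms), copied from the results table.
results = {
    "PyTorch (Accelerate)": (298.7, 3.3),
    "bare_metal::Transformer": (61.3, 16.3),
    "Scalar C++": (24.0, 41.6),
}


def per_token_latency_ms(tokens_per_s: float) -> float:
    """Assumed relation: tokens are generated one at a time,
    so per-token latency in ms is 1000 / throughput."""
    return 1000.0 / tokens_per_s


def speedup(tokens_per_s: float, baseline_tokens_per_s: float) -> float:
    """Throughput ratio relative to a chosen baseline."""
    return tokens_per_s / baseline_tokens_per_s


if __name__ == "__main__":
    baseline = results["Scalar C++"][0]
    for method, (tps, reported_ms) in results.items():
        derived_ms = per_token_latency_ms(tps)
        print(
            f"{method}: derived {derived_ms:.2f} ms, "
            f"reported {reported_ms} ms, "
            f"{speedup(tps, baseline):.1f}x vs Scalar C++"
        )
```

Under that assumption, the PyTorch (Accelerate) entry is about 12.4x faster than Scalar C++ and about 4.9x faster than bare_metal::Transformer in throughput.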