Language Modeling on C4 LLaMA-60M (val)
[Chart: Perplexity vs. Memory (GB) for this leaderboard; best entry FOAM-2 at 28.53 perplexity. Last updated Dec 8, 2025.]
Evaluation Results
| Method | Details | Date | Perplexity | Memory (GB) |
|---|---|---|---|---|
| FOAM-2 | Training Tokens=1.3B,... | 2025.12 | 28.53 | 0.27 |
| FOAM-3 | Training Tokens=1.3B,... | 2025.12 | 28.79 | 0.25 |
| MUON | Training Tokens=1.3B,... | 2025.12 | 28.93 | 0.30 |
| Full-Adam | Training Tokens=1.3B,... | 2025.12 | 29.57 | 0.34 |
| Adam-Mini | Training Tokens=1.3B,... | 2025.12 | 29.63 | 0.22 |
| FOAM-Mini | Training Tokens=1.3B,... | 2025.12 | 29.71 | 0.24 |
| APOLLO-1/4 | Training Tokens=1.3B,... | 2025.12 | 31.18 | 0.28 |
| APOLLO-1/8 | Training Tokens=1.3B,... | 2025.12 | 31.53 | 0.26 |
| APOLLO-Mini | Training Tokens=1.3B,... | 2025.12 | 31.58 | 0.24 |
| GWT-Mini | Training Tokens=1.3B,... | 2025.12 | 32.94 | 0.24 |
| GaLore-1/4 | Training Tokens=1.3B,... | 2025.12 | 34.38 | 0.28 |
| GaLore-1/8 | Training Tokens=1.3B,... | 2025.12 | 39.94 | 0.26 |