Language Modeling on C4 LLaMA-350M (val)
Top entry: FOAM-2, Perplexity 15.87 (Dec 8, 2025)
Metrics: Perplexity, Memory (G)
Evaluation Results

Method       Details                   Date     Perplexity  Memory (G)
FOAM-2       Training Tokens=7.8B,...  2025.12  15.87       1.3
FOAM-3       Training Tokens=7.8B,...  2025.12  15.94       1.14
FOAM-Mini    Training Tokens=7.8B,...  2025.12  16.53       1
APOLLO-1/4   Training Tokens=7.8B,...  2025.12  16.73       1.38
MUON         Training Tokens=7.8B,...  2025.12  16.96       1.6
APOLLO-1/8   Training Tokens=7.8B,...  2025.12  16.98       1.23
APOLLO-Mini  Training Tokens=7.8B,...  2025.12  17.17       1
Full-Adam    Training Tokens=7.8B,...  2025.12  17.33       2.2
Adam-Mini    Training Tokens=7.8B,...  2025.12  17.83       1.46
GWT-Mini     Training Tokens=7.8B,...  2025.12  18.12       1
GaLore-1/4   Training Tokens=7.8B,...  2025.12  19.36       1.38
GaLore-1/8   Training Tokens=7.8B,...  2025.12  21.59       1.23