Pre-training efficiency on Pre-training
[Chart: Muon Steps by date. Plotted point: Muon, 4,228 steps, Feb 25, 2026. Metrics: Muon Steps, Muon+ Steps, Speed-up (%).]
Updated 19d ago
Evaluation Results
| Method | Method details | Date | Muon Steps | Muon+ Steps | Speed-up (%) |
|---|---|---|---|---|---|
| Muon | Model=LLaMA-130M, Targ... | 2026.02 | 4,228 | - | - |
| Muon | Model=GPT-Large, Targe... | 2026.02 | 3,710 | - | - |
| Muon | Model=GPT-Base, Target... | 2026.02 | 3,447 | - | - |
| Muon | Model=LLaMA-350M, Targ... | 2026.02 | 3,064 | - | - |
| MUON+ | Model=LLaMA-130M, Targ... | 2026.02 | - | 3,448 | 22.6 |
| MUON+ | Model=LLaMA-350M, Targ... | 2026.02 | - | 2,374 | 29.1 |
| MUON+ | Model=GPT-Base, Target... | 2026.02 | - | 2,515 | 37.1 |
| MUON+ | Model=GPT-Large, Targe... | 2026.02 | - | 3,032 | 22.4 |
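
The page does not state how Speed-up (%) is derived. The reported values are consistent with the relative step reduction measured against the Muon+ count, that is (Muon steps / Muon+ steps - 1) × 100. The sketch below checks that reading against the listed numbers. The formula is an inference from the data, not a definition given by the source, and the function and variable names are hypothetical. It also assumes that the Muon and MUON+ rows for the same model share the same (truncated) target.

```python
def speedup_percent(muon_steps: int, muon_plus_steps: int) -> float:
    """Relative speed-up in percent.

    ASSUMPTION: speed-up = (Muon steps / Muon+ steps - 1) * 100.
    This formula is inferred from the listed values, not stated by the source.
    """
    if muon_plus_steps <= 0:
        raise ValueError("muon_plus_steps must be positive")
    return (muon_steps / muon_plus_steps - 1.0) * 100.0


# Step counts copied from the results above: model -> (Muon steps, Muon+ steps).
# Pairing rows by model name assumes both rows use the same target.
STEPS = {
    "LLaMA-130M": (4228, 3448),
    "LLaMA-350M": (3064, 2374),
    "GPT-Base": (3447, 2515),
    "GPT-Large": (3710, 3032),
}


def speedup_table(steps: dict) -> dict:
    """Compute the speed-up for every model, rounded to one decimal place."""
    return {
        model: round(speedup_percent(muon, muon_plus), 1)
        for model, (muon, muon_plus) in steps.items()
    }


if __name__ == "__main__":
    for model, pct in speedup_table(STEPS).items():
        print(f"{model}: {pct}%")
```

Under this assumption the computed values round to the four listed percentages (22.6, 29.1, 37.1, 22.4). Dividing by the Muon step count instead would give smaller numbers, for example about 18.4 for LLaMA-130M, which does not match the listed figures.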