Loss curve fitting across model sizes on MoE models (various sizes)
[Chart: ASMT MAPE, CMMT (λ=0.999) MAPE, and CMMT (λ=0.99) MAPE. Highlighted result: ASMT, 0.341 ASMT MAPE, Dec 5, 2025.]
Updated 1mo ago
Evaluation Results
| Method           | Configuration              | Date    | ASMT MAPE | CMMT (λ=0.999) MAPE | CMMT (λ=0.99) MAPE |
|------------------|----------------------------|---------|-----------|---------------------|--------------------|
| ASMT             | LRmax=1e-4, Batch Size...  | 2025.12 | 0.341     | -                   | -                  |
| ASMT             | LRmax=2e-3, Batch Size...  | 2025.12 | 0.574     | -                   | -                  |
| ASMT             | LRmax=2e-4, Batch Size...  | 2025.12 | 0.957     | -                   | -                  |
| CMMT (λ = 0.999) | LRmax=2e-4, Batch Size...  | 2025.12 | -         | 1.039               | -                  |
| CMMT (λ = 0.99)  | LRmax=2e-4, Batch Size...  | 2025.12 | -         | -                   | 0.75               |
| CMMT (λ = 0.999) | LRmax=2e-3, Batch Size...  | 2025.12 | -         | 0.487               | -                  |
| CMMT (λ = 0.99)  | LRmax=2e-3, Batch Size...  | 2025.12 | -         | -                   | 0.525              |
| CMMT (λ = 0.999) | LRmax=1e-4, Batch Size...  | 2025.12 | -         | 0.373               | -                  |
| CMMT (λ = 0.99)  | LRmax=1e-4, Batch Size...  | 2025.12 | -         | -                   | 0.356              |
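
For readers unfamiliar with the metric, MAPE (mean absolute percentage error) is commonly computed as the average of |actual - predicted| / |actual|, expressed as a percentage. The sketch below is a minimal illustration of that standard definition only. The function name `mape` and the sample loss values are hypothetical, and this page does not state whether its scores are reported as percentages or fractions, or exactly which points of the loss curve enter the average.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Standard textbook definition; not necessarily the exact
    procedure used to produce the scores in the table above.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    # Each term is the relative error at one point. Loss values are
    # assumed to be non-zero, otherwise the division below fails.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical example: observed vs. fitted loss at four checkpoints.
actual_loss = [2.50, 2.40, 2.30, 2.20]
predicted_loss = [2.51, 2.39, 2.31, 2.19]
score = mape(actual_loss, predicted_loss)
print(f"MAPE: {score:.3f}%")
```

Lower is better: a score of 0 means the fitted curve matches every observed loss value exactly.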