Language Modeling on Fineweb-edu distillation 8B to 300M
[Chart: LM Loss over time. Top result: Random Sampling KD, 2.74 LM Loss, Mar 21, 2025. Metrics reported: LM Loss, CE to FullKD Ratio, ECE.]
Evaluation Results
| Method | Date | LM Loss | CE to FullKD Ratio | ECE |
| --- | --- | --- | --- | --- |
| Random Sampling KD | 2025.03 | 2.74 | 100 | 0.9 |
| FullKD | 2025.03 | 2.75 | 100 | 0.7 |
| Raman et al. (2023), sampling method only=f... | 2025.03 | 2.77 | 57 | 1.9 |
| Peng et al. (2024), sampling method only=true | 2025.03 | 2.78 | -47 | 1.7 |
| Raman et al. (2023), sampling method only=true | 2025.03 | 2.8 | 5 | 2.2 |
| CE | 2025.03 | 2.81 | 0 | 1.2 |
| Peng et al. (2024), sampling method only=f... | 2025.03 | 2.85 | -78 | 1.4 |