Speculative Decoding on Pre-training dataset
[Chart: Speculative Accept % by date. Full Knowledge Distillation: 0.658 (Mar 21, 2025).]
Updated 1mo ago
Evaluation Results
Method                        Settings                  Date     Speculative Accept %
Full Knowledge Distillation   Training Tokens=100B,...  2025.03  0.658
Random Sampling KD            Training Tokens=100B,...  2025.03  0.657
Cross-Entropy                 Training Tokens=100B,...  2025.03  0.646