LLM Pretraining on 500-shard corpus (val)
[Chart: Validation BPB. Highlighted point: Opus, 0.758 (Mar 31, 2026).]
Updated 5d ago
Evaluation Results

Method    Notes                      Date      Validation BPB
Opus      vs. baseline=−22%, Cos...  2026.03   0.758
Sonnet    vs. baseline=−10%, Cos...  2026.03   0.869
GPT-5.2   Cost=$150                  2026.03   0.97