OpenR1

Benchmarks

Task Name               Dataset Name      SOTA Result              Trend
Mathematical Reasoning  OpenR1 Math 220k  Composite Score: 0.8633  20
Language Modeling       OpenR1-Math       Perplexity (PPL): 2.86   15
Math Reasoning          OpenR1            Pass@8: 72               14
Language Modeling       OpenR1            Perplexity (PPL): 2.64   11