Mathematical Reasoning on OlympiadBench (accuracy)

0.7659 Accuracy

JustRL-Nemotron

[Chart: accuracy over time, Nov 28, 2025 to Mar 22, 2026]
Updated 25d ago

Evaluation Results

Method  Links
2025.12  0.7659  -
2025.12  0.7228  -
2025.12  0.717   -
2026.01  0.642   -
2026.01  0.625   -
2026.02  0.5634  -
2026.01  0.557   -
2026.02  0.5497  -
2026.02  0.5453  -
         0.538   -
2026.01  0.479   -
2026.03  0.479   -
2026.03  0.477   -
         0.468   -
2026.03  0.468   -
2026.01  0.467   -
2026.03  0.467   -
2026.03  0.453   -
2026.03  0.452   -
2026.03  0.452   -
2026.03  0.441   -
2026.01  0.44    -
2026.03  0.44    -
2026.01  0.436   -
2026.03  0.431   -
         0.428   -
         0.427   -
2026.01  0.424   -
2026.03  0.424   -
2026.01  0.41    -
         0.41    -
2026.03  0.41    -
2026.01  0.409   -
2026.01  0.409   -
         0.409   -
2026.03  0.405   -
2026.01  0.403   -
         0.403   -
2026.01  0.402   -
         0.402   -
2026.03  0.4     -
2026.03  0.395   -
2026.01  0.393   -
2026.01  0.387   -
2026.01  0.376   -
         0.376   -
2026.03  0.358   -
         0.347   -
2026.03  0.302   -
2026.01  0.284   -
         0.284   -
2025.11  0.26    -
2025.11  0.26    -
2026.03  0.259   -
2026.03  0.255   -
2025.11  0.253   -
2026.03  0.249   -
2025.11  0.247   -
2025.11  0.241   -
2025.11  0.238   -
2025.11  0.235   -
2026.03  0.231   -
2025.11  0.226   -
2025.11  0.224   -
2025.11  0.22    -
2026.03  0.173   -
2026.03  0.173   -
2026.03  0.172   -
2026.03  0.17    -
2026.01  0.166   -
2026.03  0.151   -
2026.03  0.11    -
2025.09  -  0.5096
2025.09  -  0.4444
2025.09  -  0.5348
2025.09  -  0.52
2025.09  -  0.5378
2025.09  -  0.0563
2025.09  -  0.0607
2025.09  -  0.0622
2025.09  -  0.0533
2025.09  -  0.0489