Tool-Integrated Reasoning on Olympiad Bench

52.75 Test Accuracy

Seq-ALP

[Chart: Test Accuracy, y-axis ticks 48.2052 to 51.7449; date label: Mar 19, 2026]
Updated 27d ago

Evaluation Results

Date      Test Accuracy
2026.03   52.75
2026.03   51.55
2026.03   51.08
2026.03   50.65
2026.03   49.59
2026.03   48.38