
ALE Bench

Benchmarks

Task Name                                 Dataset Name  SOTA Result                 Trend
Computer Science Problem Solving          ALE-bench     Average Score @5: 661.64    10
Competitive Programming Agent Evaluation  ALE Bench     Final Performance: 1,909.4  4