Agentic Task Solving on DataBench

92.7 Pass@3

[Chart: Pass@3. Axis ticks: 70.028, 75.914, 81.8, 87.686. Date label: May 8, 2026]
Updated 23d ago

Evaluation Results

Date      Pass@3   (unlabeled)
2026.05   92.7     96
2026.05   89.9     94.7
2026.05   88.4     93.3
2026.05   85.7     89.3
2026.05   79.7     82.7
2026.05   77.6     85.3
2026.05   77.2     84
2026.05   72.5     80
2026.05   70.9     81.3