ML Engineering on MLE-Bench official (test)

71.2 Medal Rate (Low)

Gome GPT-5 (24h)

1.5219.6137.755.79 Mar 2, 2026
Updated 3mo ago

Evaluation Results

Method  Links
2026.03   71.2   28.1   26.7   40.4
2026.03   68.2   21.1   22.2   35.1
2026.03   65.2   45.6   31.1   48.4
2026.03   63.6   -      -      -
2026.03   63.6   27.2   24.4   36.4
2026.03   62.1   26.3   24.4   36.4
2026.03   62.1   36.8   33.3   43.6
2026.03   59.1   20.2   22.2   32.5
2026.03   55     22     21.7   31.6
2026.03   51.5   10.5   17.8   24
2026.03   51.5   -      -      -
2026.03   50     9.7    20     23.4
2026.03   48.5   29.8   24.4   34.2
2026.03   47     9.7    20     22.7
2026.03   42.4   12.3   20     22.7
2026.03   34.3   8.8    10     16.9
2026.03   19     3.2    5.6    8.6
2026.03   11.5   2.2    1.9    5.1
2026.03   4.2    0      0      1.3