Performance Estimation on HLE (Humanity's Last Exam) 1% subset

MAE: 3.5

Scales++

[Chart: MAE over time; data point dated Oct 30, 2025]
Updated 15d ago

Evaluation Results

Date     MAE  (unlabeled)  Method  Links
2025.10  3.5  0.3
2025.10  4.2  0.3
2025.10  5.3  0.5