Performance Estimation on HLE (Humanity's Last Exam) 0.5% subset

5.6 MAE

Scales++

[Chart: MAE over time; y-axis ticks 5.56, 5.83, 6.1, 6.37; x-axis Oct 30, 2025]
Updated 15d ago

Evaluation Results

Method    Links
2025.10    5.6    0.6
2025.10    6.3    0.7
2025.10    6.6    0.3