Reasoning on Humanity's Last Exam (HLE) (test)
[Chart: Accuracy by date. Highlighted point: HEART, 72.19 Accuracy, Sep 26, 2025.]
Updated 4d ago
Evaluation Results
Method           Configuration               Links     Accuracy
HEART            Model=Gemini 3 Pro, Pr...   2025.09   72.19
Self Reflection  Model=Gemini 3 Pro, Pr...   2025.09   68.58
CoT              Model=Gemini 3 Pro, Pr...   2025.09   68.14
Wait             Model=Gemini 3 Pro, Pr...   2025.09   66.3
HEART            Model=Gemini 3 Flash,...    2025.09   58.89
Wait             Model=Gemini 3 Flash,...    2025.09   49.37
Self Reflection  Model=Gemini 3 Flash,...    2025.09   48.36
CoT              Model=Gemini 3 Flash,...    2025.09   46.83
Vanilla          Model=Gemini 3 Pro, Pr...   2025.09   40.8
Vanilla          Model=Gemini 3 Flash,...    2025.09   12