Expert-Level Reasoning on HLE (text-only subset, val)
[Chart: Inference Accuracy by date. Top result: ReThinker, 52.2 (Feb 4, 2026)]
Updated 3mo ago
Evaluation Results
| Method | Model Category | Date | Inference Accuracy |
|---|---|---|---|
| ReThinker | Inferen... | 2026.02 | 52.2 |
| Gemini-3-Pro | Foundat... | 2026.02 | 38.3 |
| GPT-5-high | Foundat... | 2026.02 | 35.2 |
| MiroThinker-v1.0 | Inferen... | 2026.02 | 33.4 |
| ReThinker | Inferen... | 2026.02 | 33.1 |
| Tongyi DeepResearch | Inferen... | 2026.02 | 32.9 |
| GLM-4.6 | Foundat... | 2026.02 | 30.4 |
| DeepSeek-V3.2 | Foundat... | 2026.02 | 27.2 |
| Kimi Researcher | Inferen... | 2026.02 | 26.9 |
| OpenAI DeepResearch | Inferen... | 2026.02 | 26.6 |
| Claude-4.5-Sonnet | Foundat... | 2026.02 | 24.5 |
| Kimi K2 | Foundat... | 2026.02 | 18.1 |
| WebExplorer | Inferen... | 2026.02 | 17.3 |