Reasoning on HLE (test)
[Chart: Accuracy and Total Cost ($) of the evaluated methods. Best Accuracy: 26 (MTRouter), Apr 26, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Routing Level | Date | Accuracy | Total Cost ($) |
|---|---|---|---|---|
| MTRouter | Multi-Tu... | 2026.04 | 26 | 35 |
| GPT-5 | Single-M... | 2026.04 | 25.1 | 61.8 |
| EmbedLLM | Single-T... | 2026.04 | 24.8 | 61.4 |
| Router-R1 | Multi-Tu... | 2026.04 | 24.2 | 51.9 |
| LLM Router | Multi-Tu... | 2026.04 | 24 | 56.2 |
| AvengersPro | Single-T... | 2026.04 | 23.7 | 47.5 |
| Random Router | Multi-Tu... | 2026.04 | 20 | 16.9 |
| OpenRouter | Multi-Tu... | 2026.04 | 18.3 | 138.5 |
| DeepSeek-V3.2 | Single-M... | 2026.04 | 15.6 | 22.4 |
| RouterDC | Single-T... | 2026.04 | 12.8 | 10.3 |
| Kimi-K2 | Single-M... | 2026.04 | 11.4 | 12 |
| GPT-OSS-120B | Single-M... | 2026.04 | 9.7 | 0.7 |
| MiniMax-M2 | Single-M... | 2026.04 | 7.8 | 18.1 |
| Gemini-2.5-Flash-Lite | Single-M... | 2026.04 | 5.6 | 3 |