Reasoning on HLE OOD
[Chart: Accuracy and Total Cost ($)]
Best Accuracy: 38.6 (MTRouter), Apr 26, 2026
Updated 1mo ago
Evaluation Results
| Method | Routing Level | Date | Accuracy | Total Cost ($) |
|---|---|---|---|---|
| MTRouter | Multi-Tu... | 2026.04 | 38.6 | 31.2 |
| LLM Router | Multi-Tu... | 2026.04 | 36 | 35.6 |
| Router-R1 | Multi-Tu... | 2026.04 | 35.1 | 60.7 |
| GPT-5 | Single-M... | 2026.04 | 34.8 | 65.3 |
| OpenRouter | Multi-Tu... | 2026.04 | 34 | 154.3 |
| EmbedLLM | Single-T... | 2026.04 | 33.6 | 56.8 |
| AvengersPro | Single-T... | 2026.04 | 30.6 | 33.3 |
| DeepSeek-V3.2 | Single-M... | 2026.04 | 28.7 | 22.2 |
| Random Router | Multi-Tu... | 2026.04 | 23.8 | 14.7 |
| Kimi-K2 | Single-M... | 2026.04 | 20.1 | 9.8 |
| RouterDC | Single-T... | 2026.04 | 17.9 | 13.5 |
| GPT-OSS-120B | Single-M... | 2026.04 | 11.4 | 2.1 |
| MiniMax-M2 | Single-M... | 2026.04 | 8.9 | 9.8 |
| Gemini-2.5-Flash-Lite | Single-M... | 2026.04 | 8.4 | 2.1 |