
Route to Reason: Adaptive Routing for LLM and Reasoning Strategy Selection

About

The inherent capabilities of a language model (LM) and the reasoning strategies it employs jointly determine its performance on reasoning tasks. While test-time scaling is regarded as an effective approach to tackling complex reasoning tasks, it incurs substantial computational costs and often leads to "overthinking", where models become trapped in "thought pitfalls". To address this challenge, we propose Route-To-Reason (RTR), a novel unified routing framework that dynamically allocates both LMs and reasoning strategies according to task difficulty under budget constraints. RTR learns compressed representations of both expert models and reasoning strategies, enabling their joint and adaptive selection at inference time. This method is low-cost, highly flexible, and can be seamlessly extended to arbitrary black-box or white-box models and strategies, achieving true plug-and-play functionality. Extensive experiments across seven open-source models and four reasoning strategies demonstrate that RTR achieves an optimal trade-off between accuracy and computational efficiency among all baselines, attaining higher accuracy than the best single model while reducing token usage by over 60%.
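To make the routing idea concrete, here is a minimal sketch of budget-constrained joint selection over (model, strategy) pairs. Everything in it is illustrative: the model names, the static table of predicted accuracy and token cost, and the `route` function are all hypothetical placeholders. In RTR these predictions would come from learned compressed representations of the experts and strategies conditioned on the query, not a fixed lookup.

```python
# Hypothetical predicted (accuracy, expected token cost) for each
# (model, strategy) pair; illustrative numbers only.
PREDICTED = {
    ("small-lm", "direct"):           (0.62, 120),
    ("small-lm", "chain-of-thought"): (0.71, 480),
    ("large-lm", "direct"):           (0.78, 150),
    ("large-lm", "chain-of-thought"): (0.90, 900),
}

def route(query_features, token_budget):
    """Pick the (model, strategy) pair with the highest predicted
    accuracy whose expected token cost fits within the budget.

    `query_features` is unused in this static sketch; a learned
    router would condition its predictions on it.
    """
    feasible = [
        (pair, acc, cost)
        for pair, (acc, cost) in PREDICTED.items()
        if cost <= token_budget
    ]
    if not feasible:
        # Nothing fits: fall back to the cheapest option.
        return min(PREDICTED, key=lambda p: PREDICTED[p][1])
    # Among feasible pairs, maximize predicted accuracy.
    return max(feasible, key=lambda t: t[1])[0]

print(route(None, token_budget=500))  # -> ('large-lm', 'direct')
```

With a 500-token budget the router skips the most accurate but most expensive pair (large model with chain-of-thought, 900 tokens) and picks the best pair that fits, which is the intuition behind trading a small amount of accuracy for a large reduction in token usage.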

Zhihong Pan, Kai Zhang, Yuze Zhao, Yupeng Han • 2025

Related benchmarks

Task                         | Dataset                                 | Metric   | Result | Rank
Question Answering           | WikiQA                                  | Accuracy | 26     | 29
Question Answering           | TATQA                                   | F1       | 7.96   | 27
Continual routing            | 2WikiMultiHop                           | Accuracy | 59.4   | 22
Continual routing            | GSM8K                                   | Accuracy | 91.6   | 22
Continual routing            | Average                                 | Accuracy | 74.7   | 22
Continual routing            | MMLU                                    | Accuracy | 73.7   | 22
Routing                      | SVAMP                                   | Accuracy | 92.4   | 5
Routing                      | OOD Average (HotpotQA, GPQA, SVAMP)     | Accuracy | 62.4   | 5
Mathematical Reasoning       | GSM8K (in-distribution)                 | Accuracy | 91.3   | 5
Multi-hop Question Answering | 2WikiMultiHop (in-distribution)         | Accuracy | 57.9   | 5

(Showing 10 of 14 rows.)
