CONCUR: A Framework for Continual Constrained and Unconstrained Routing

About

AI tasks differ in complexity and are best addressed with different computation strategies (e.g., combinations of models and decoding methods). Hence, an effective routing system that maps tasks to the appropriate strategies is crucial. Most prior methods build the routing framework by training a single model across all strategies, which demands full retraining whenever new strategies appear and leads to high overhead. Attempts at such continual routing, however, often face difficulties with generalization. Prior models also typically use a single input representation, limiting their ability to capture the full complexity of the routing problem and leading to sub-optimal routing decisions. To address these gaps, we propose CONCUR, a continual routing framework that supports both constrained and unconstrained routing (i.e., routing with or without a budget). Our modular design trains a separate predictor model for each strategy, enabling seamless incorporation of new strategies with low additional training cost. Our predictors also leverage multiple representations of both tasks and computation strategies to better capture overall problem complexity. Experiments on both in-distribution and out-of-distribution, knowledge- and reasoning-intensive tasks show that our method outperforms the best single strategy and strong existing routing techniques with higher end-to-end accuracy and lower inference cost in both continual and non-continual settings, while also reducing training cost in the continual setting.
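The modular design described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Strategy` and `Router` names, the per-task budget, the stand-in lambda predictors, and the selection rule (highest predicted accuracy among affordable strategies, with ties broken by lower cost) are all assumptions made for illustration. It shows only the structural idea that each strategy carries its own predictor, so adding a strategy does not touch the existing ones.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Strategy:
    """Hypothetical container: one computation strategy plus its own predictor."""
    name: str
    cost: float                        # assumed fixed inference cost per task
    predictor: Callable[[str], float]  # assumed to return estimated accuracy in [0, 1]


class Router:
    """Illustrative router holding one independent predictor per strategy."""

    def __init__(self) -> None:
        self.strategies: Dict[str, Strategy] = {}

    def add_strategy(self, strategy: Strategy) -> None:
        # Continual setting: registering a new strategy leaves existing predictors untouched.
        self.strategies[strategy.name] = strategy

    def route(self, task: str, budget: Optional[float] = None) -> str:
        # budget=None means unconstrained routing; otherwise filter by per-task cost
        # (a per-task budget is an assumption of this sketch).
        affordable = [
            s for s in self.strategies.values()
            if budget is None or s.cost <= budget
        ]
        if not affordable:
            raise ValueError("no strategy fits within the budget")
        # Pick the highest predicted accuracy; break ties by lower cost.
        best = max(affordable, key=lambda s: (s.predictor(task), -s.cost))
        return best.name


# Stand-in predictors; a real system would use trained models here.
router = Router()
router.add_strategy(Strategy("small_greedy", 1.0, lambda t: 0.9 if len(t) < 20 else 0.4))
router.add_strategy(Strategy("large_cot", 5.0, lambda t: 0.8))

long_task = "Who directed the film whose lead actor was born in Ohio?"
easy_choice = router.route("2 + 2 = ?")
hard_choice = router.route(long_task)
capped_choice = router.route(long_task, budget=2.0)

# A new strategy arrives later; only its own predictor is added.
router.add_strategy(Strategy("large_self_consistency", 20.0, lambda t: 0.95))
new_choice = router.route(long_task)
new_capped_choice = router.route(long_task, budget=5.0)
```

In this toy setup the short task goes to the cheap strategy, the long task goes to the costlier one unless the budget rules it out, and the later-added strategy is picked up without any change to the first two predictors.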

Peter Baile Chen, Weiyue Li, Dan Roth, Michael Cafarella, Samuel Madden, Jacob Andreas • 2025

Related benchmarks

Task	Dataset	Result
Multi-Task Reasoning	Average (2WikiMultiHop, MMLU, GSM8K) (in-distribution)	Accuracy 75.2	29
Continual routing	2WikiMultiHop	Accuracy 59.5	22
Continual routing	MMLU	Accuracy 74.5	22
Continual routing	GSM8K	Accuracy 91.7	22
Continual routing	Average	Accuracy 75.2	22
Multi-hop Question Answering	2WikiMultiHop (in-distribution)	Accuracy 59.5	5
Multi-task Language Understanding	MMLU (in-distribution)	Accuracy 74.4	5
Routing	HotpotQA	Accuracy 60.6	5
Routing	OOD Average (HotpotQA, GPQA, SVAMP)	Accuracy 62.6	5
Mathematical Reasoning	GSM8K (in-distribution)	Accuracy 91.6	5

