Beyond Query Memorization: Large Language Model Routing with Query Decomposition and Historical Matching
About
Optimizing the trade-off between predictive performance and computational cost is a central concern in the deployment of Large Language Models (LLMs). Current routing methods primarily rely on a direct mapping from queries to models based on surface-level features, which makes them susceptible to the memorization trap and leads to poor generalization on out-of-distribution (OOD) data. In this paper, we propose DecoR, a novel routing framework that recasts the routing task as a matching process that retrieves similar queries from historical logs, effectively mitigating the memorization trap. To enhance matching accuracy, we introduce a query capability deconstruction method that decouples linguistic surface forms from task-intrinsic requirements, directing matching toward capability dimensions so that decisions are grounded in essential task attributes. Furthermore, we develop CodaSet, a comprehensive benchmark for assessing routing generalization. Experimental results on CodaSet demonstrate that DecoR maintains superior accuracy while substantially lowering inference costs in both in-distribution and OOD settings. All code and data are available at https://github.com/lvbotenbest/DecoR.
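The idea of routing by matching against historical logs, rather than mapping a query directly to a model, can be illustrated with a minimal sketch. This is not the DecoR implementation: the capability profiles, the `LogEntry` type, the `route` function, the cosine-similarity matching, the cheapest-first selection rule, and the `k` and `min_success` parameters are all assumptions made for illustration. In particular, the sketch takes capability profiles as given, whereas the paper derives them with its query capability deconstruction method.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LogEntry:
    """One historical query (hypothetical schema, not from the paper)."""
    capabilities: Dict[str, float]  # assumed capability profile of the query
    outcomes: Dict[str, bool]       # model name -> whether it answered correctly


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse capability profiles."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query_caps: Dict[str, float], logs: List[LogEntry],
          costs: Dict[str, float], k: int = 3, min_success: float = 0.6) -> str:
    """Pick the cheapest model that succeeded often enough on similar past queries."""
    # Match on capability profiles, not on the surface text of the query.
    neighbors = sorted(
        logs, key=lambda e: cosine(query_caps, e.capabilities), reverse=True
    )[:k]
    # Try models from cheapest to most expensive.
    for model in sorted(costs, key=costs.get):
        hits = [e.outcomes[model] for e in neighbors if model in e.outcomes]
        if hits and sum(hits) / len(hits) >= min_success:
            return model
    # Fall back to the most expensive model when no model clears the bar.
    return max(costs, key=costs.get)


# Toy historical log and cost table (made-up values for illustration only).
LOGS = [
    LogEntry({"arithmetic": 1.0}, {"small": True, "large": True}),
    LogEntry({"arithmetic": 0.9, "multi_step": 0.2}, {"small": True, "large": True}),
    LogEntry({"multi_step": 1.0, "code": 0.8}, {"small": False, "large": True}),
    LogEntry({"code": 1.0}, {"small": False, "large": True}),
    LogEntry({"multi_step": 0.9, "code": 0.5}, {"small": False, "large": True}),
]
COSTS = {"small": 1.0, "large": 10.0}

easy = route({"arithmetic": 1.0}, LOGS, COSTS, k=2)
hard = route({"code": 1.0, "multi_step": 0.5}, LOGS, COSTS, k=3)
print(easy, hard)
```

In this toy setup, a query whose profile is dominated by arithmetic matches past queries that the cheap model handled, so it is routed to the cheap model, while a code-heavy multi-step query matches past failures of the cheap model and is escalated. Because matching operates on capability dimensions, a query phrased in an unfamiliar way can still find relevant history, which is the intuition behind the OOD robustness claimed above.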
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | CodaSet ID GSM8k (test) | Accuracy | 0.9559 | 16 |
| Symbolic and Logical Reasoning | CodaSet BBH ID (test) | Accuracy | 93.89 | 16 |
| Code Generation | MBPP CodaSet OOD (test) | Performance (%) | 76.72 | 16 |
| Holistic Evaluation | CodaSet ID Average (test) | Accuracy | 89.35 | 16 |
| Instruction Following | CodaSet ID IFEVAL (test) | Accuracy | 87.56 | 16 |
| Multi-turn Dialogue Evaluation | MT_Bench CodaSet OOD (test) | Performance (%) | 98.16 | 16 |
| General Language Modeling | CodaSet OOD Average (test) | Performance (%) | 86.49 | 16 |
| Mathematical Reasoning | Math_500 CodaSet OOD (test) | Accuracy (%) | 85.51 | 16 |
| Multi-discipline Knowledge Evaluation | CodaSet ID MMLU-PRO (test) | Accuracy | 80.36 | 16 |