
Beyond Query Memorization: Large Language Model Routing with Query Decomposition and Historical Matching

About

Optimizing the trade-off between predictive performance and computational cost is a central focus in the deployment of Large Language Models (LLMs). Current routing methods primarily rely on a direct mapping from queries to models based on surface-level features, which makes them susceptible to the memorization trap and leads to poor generalization on out-of-distribution (OOD) data. In this paper, we propose DecoR, a novel routing framework that recasts the routing task as a matching process that retrieves similar queries from historical logs, effectively mitigating the memorization trap. To enhance matching accuracy, we introduce a query capability deconstruction method that decouples linguistic surface forms from task-intrinsic requirements, directing matching toward capability dimensions so that decisions are grounded in essential task attributes. Furthermore, we develop CodaSet, a comprehensive benchmark for assessing routing generalization. Experimental results on CodaSet demonstrate that DecoR maintains superior accuracy while substantially lowering inference costs across both in-distribution and OOD settings. All code and data are available at https://github.com/lvbotenbest/DecoR.

Bo Lv, Jingbo Sun • 2026
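
The matching idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the capability dimensions, the `LogEntry` structure, the cosine-similarity lookup, the `k` and `tolerance` parameters, and the model names and costs are all assumptions made for illustration. The sketch takes a query that has already been reduced to a capability vector, finds the most similar historical queries, and picks the cheapest model whose estimated accuracy on those neighbours is close to the best.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class LogEntry:
    # Hypothetical capability scores, here [math, code, instruction following].
    capabilities: list[float]
    # Model name -> whether that model answered this historical query correctly.
    outcomes: dict[str, bool]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two capability vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query_caps, logs, costs, k=3, tolerance=0.05):
    """Pick the cheapest model that is near-best on the k most similar logged queries."""
    # Match in capability space rather than on the query's surface text.
    neighbours = sorted(
        logs, key=lambda e: cosine(query_caps, e.capabilities), reverse=True
    )[:k]

    # Estimate each model's accuracy from its outcomes on the neighbours.
    estimated = {}
    for model in costs:
        hits = [e.outcomes[model] for e in neighbours if model in e.outcomes]
        estimated[model] = sum(hits) / len(hits) if hits else 0.0

    best = max(estimated.values())
    eligible = [m for m in costs if estimated[m] >= best - tolerance]
    return min(eligible, key=lambda m: costs[m])


# Toy historical log and per-query costs (invented values, illustration only).
LOGS = [
    LogEntry([0.9, 0.1, 0.2], {"small": False, "large": True}),
    LogEntry([0.8, 0.0, 0.3], {"small": False, "large": True}),
    LogEntry([0.1, 0.1, 0.9], {"small": True, "large": True}),
    LogEntry([0.2, 0.0, 0.8], {"small": True, "large": True}),
    LogEntry([0.0, 0.2, 0.9], {"small": True, "large": True}),
]
COSTS = {"small": 1.0, "large": 10.0}

print(route([0.1, 0.0, 0.9], LOGS, COSTS))  # instruction-heavy query -> small
print(route([0.9, 0.0, 0.1], LOGS, COSTS))  # math-heavy query -> large
```

In this toy setup the step that turns a raw query into a capability vector is left out entirely; the abstract attributes that step to DecoR's query capability deconstruction method, whose details are not given here.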

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | CodaSet ID GSM8k (test) | Accuracy | 0.9559 | 16 |
| Symbolic and Logical Reasoning | CodaSet BBH ID (test) | Accuracy | 93.89 | 16 |
| Code Generation | MBPP CodaSet OOD (test) | Performance (%) | 76.72 | 16 |
| Holistic Evaluation | CodaSet ID Average (test) | Accuracy | 89.35 | 16 |
| Instruction Following | CodaSet ID IFEVAL (test) | Accuracy | 87.56 | 16 |
| Multi-turn Dialogue Evaluation | MT_Bench CodaSet OOD (test) | Performance (%) | 98.16 | 16 |
| General Language Modeling | CodaSet OOD Average (test) | Performance (%) | 86.49 | 16 |
| Mathematical Reasoning | Math_500 CodaSet OOD (test) | Accuracy (%) | 85.51 | 16 |
| Multi-discipline Knowledge Evaluation | CodaSet ID MMLU-PRO (test) | Accuracy | 80.36 | 16 |