RouteLMT: Learned Sample Routing for Hybrid LLM Translation Deployment

About

Large Language Models (LLMs) have achieved remarkable performance in Machine Translation (MT), but deploying them at scale remains prohibitively expensive. A widely adopted remedy is the hybrid system paradigm, which balances cost and quality by serving most requests with a small model and selectively routing a fraction to a large model. However, existing routing strategies often rely on heuristics, external predictors, or absolute quality estimation, none of which captures whether the large model actually provides a worthwhile improvement over the small one. In this paper, we formulate routing as a budget allocation problem and identify marginal gain, i.e., the large model's improvement over the small model, as the optimal signal for budgeted decisions. Building on this, we propose RouteLMT (routing for LLM-based MT), an efficient in-model router that predicts this expected gain by probing the small translator's prompt-token representation, without requiring external models or hypothesis decoding. Extensive experiments demonstrate that RouteLMT outperforms heuristic and quality/difficulty estimation baselines, achieving a superior quality-budget Pareto frontier. Furthermore, we analyze regression risks and show that a simple guarded variant can mitigate severe quality losses.
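
The budget-allocation view described above can be illustrated with a minimal sketch. It assumes a per-request predicted gain is already available and that the budget is expressed as the fraction of requests allowed to reach the large model. The function names `route_under_budget` and `mean_quality`, the top-k selection rule, and all numbers are hypothetical illustrations, not the paper's implementation.

```python
from typing import List, Sequence


def route_under_budget(predicted_gain: Sequence[float], budget: float) -> List[bool]:
    """Return one flag per request: True means "send to the large model".

    predicted_gain[i] is a hypothetical estimate of how much the large model
    would improve on the small model for request i. budget is the fraction
    of requests, in [0, 1], that may be served by the large model.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be in [0, 1]")
    n = len(predicted_gain)
    # Number of large-model slots, rounded down (small epsilon guards
    # against floating-point error in budget * n).
    k = int(budget * n + 1e-9)
    # Rank requests by predicted gain, highest first; ties keep input order.
    ranked = sorted(range(n), key=lambda i: predicted_gain[i], reverse=True)
    chosen = set(ranked[:k])
    return [i in chosen for i in range(n)]


def mean_quality(small_q: Sequence[float], large_q: Sequence[float],
                 to_large: Sequence[bool]) -> float:
    """Average quality of the hybrid system under a routing decision."""
    picked = [l if use_large else s
              for s, l, use_large in zip(small_q, large_q, to_large)]
    return sum(picked) / len(picked)


# Made-up quality scores for four requests (illustrative only).
small_q = [0.80, 0.50, 0.70, 0.60]
large_q = [0.85, 0.80, 0.60, 0.80]
gains = [l - s for s, l in zip(small_q, large_q)]  # oracle marginal gain

decision = route_under_budget(gains, budget=0.5)
print(decision)
print(mean_quality(small_q, large_q, decision))
```

In this toy setting the third request has a negative gain, so routing it to the large model would lower quality; ranking by gain rather than by absolute quality keeps it on the small model.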

Yingfeng Luo, Hongyu Liu, Dingyang Lin, Kaiyan Chang, Chenglong Wang, Bei Li, Quan Du, Tong Xiao, Jingbo Zhu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Budgeted Hybrid Routing | Medical En→Zh | Hit Rate@p | 47.66 | 12 |
| Budgeted Hybrid Routing | Medical En→Ru | Hit Rate@p | 54.72 | 12 |
| Budgeted Hybrid Routing | Medical Zh→En | Hit Rate@p | 53.03 | 12 |
| Budgeted Hybrid Routing | Medical Ru→En | Hit Rate@p | 51.57 | 12 |
| Budgeted Hybrid Routing | Medical Average Global | Spearman Correlation | 0.3 | 12 |
| Budgeted Hybrid Routing | Colloquial domain (test) | Spearman Correlation (Avg.) | 0.3 | 12 |
| Translation Routing | En-Zh | Hit Rate@p | 56.24 | 12 |
| Translation Routing | En-Ru | Hit Rate@p | 60.5 | 12 |
| Translation Routing | Zh-En | Hit Rate@p | 0.5564 | 12 |
| Translation Routing | Ru-En | Hit Rate@p | 56.93 | 12 |
Showing 10 of 11 rows
