
LLMRank: Understanding LLM Strengths for Model Routing

About

The rapid growth of large language models (LLMs) with diverse capabilities, latencies, and computational costs presents a critical deployment challenge: selecting the most suitable model for each prompt to optimize the trade-off between performance and efficiency. We introduce LLMRank, a prompt-aware routing framework that leverages rich, human-readable features extracted from prompts, including task type, reasoning patterns, complexity indicators, syntactic cues, and signals from a lightweight proxy solver. Unlike prior one-shot routers that rely solely on latent embeddings, LLMRank predicts per-model utility using a neural ranking model trained on RouterBench, which comprises 36,497 prompts spanning 11 benchmarks and 11 state-of-the-art LLMs, from small efficient models to large frontier systems. Our approach achieves up to 89.2% of oracle utility while providing interpretable feature attributions that explain routing decisions. Extensive studies demonstrate the importance of multifaceted feature extraction and the hybrid ranking objective, highlighting the potential of feature-driven routing for efficient and transparent LLM deployment.
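To make the idea concrete, the sketch below illustrates feature-driven routing in miniature: extract a few human-readable prompt features, score each candidate model with a utility that trades predicted quality against cost, and route to the argmax. Everything here (the feature set, the model pool, the linear utility, and all constants) is hypothetical and far simpler than LLMRank's learned neural ranker; it is only meant to show the shape of the approach.

```python
import re

def extract_features(prompt: str) -> dict:
    """Toy human-readable features; stand-ins for LLMRank's richer set
    (task type, reasoning patterns, complexity indicators, etc.)."""
    return {
        "length": len(prompt.split()),
        "has_math": bool(re.search(r"\d+\s*[-+*/=]\s*\d+", prompt)),
        "has_code": "def " in prompt or "```" in prompt,
        "is_question": prompt.rstrip().endswith("?"),
    }

# Hypothetical model pool: illustrative quality estimates and relative costs.
MODELS = {
    "small-efficient": {"base_quality": 0.55, "cost": 0.1},
    "mid-tier":        {"base_quality": 0.70, "cost": 0.4},
    "large-frontier":  {"base_quality": 0.85, "cost": 1.0},
}

def utility(features: dict, model: dict, cost_weight: float = 0.6) -> float:
    """Linear utility: predicted quality minus weighted cost.
    Difficulty amplifies the quality gap, so hard prompts (math, code,
    long inputs) favor stronger models despite their higher cost."""
    difficulty = (
        0.3 * features["has_math"]
        + 0.3 * features["has_code"]
        + min(features["length"] / 100, 0.4)
    )
    quality = model["base_quality"] + (model["base_quality"] - 0.5) * 2 * difficulty
    return quality - cost_weight * model["cost"]

def route(prompt: str) -> str:
    """Pick the model with the highest predicted utility for this prompt."""
    feats = extract_features(prompt)
    return max(MODELS, key=lambda name: utility(feats, MODELS[name]))
```

Under these toy weights, a trivial prompt routes to the cheap model while a long prompt containing code and arithmetic routes to the frontier model; in LLMRank the utility is instead predicted per model by a ranking network trained on RouterBench, and the feature attributions make each routing decision inspectable.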

Shubham Agrawal, Prasang Gupta • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| LLM Routing | BBEH | Top-1 Accuracy | 34.5 | 14 |
| LLM Routing | BBEH (val) | Top-1 Accuracy | 37.5 | 14 |
| LLM Routing | SuperGPQA | Top-1 Accuracy | 52.6 | 14 |
| LLM Routing | Average across Benchmarks (val) | Avg Top-1 Accuracy | 65.6 | 14 |
| LLM Routing | MMLU-PRO (val) | Top-1 Accuracy | 78.5 | 14 |
| LLM Routing | MedMCQA (val) | Top-1 Accuracy | 92.7 | 14 |
| LLM Routing | MMLU-PRO, SUPERGPQA, MEDMCQA, BBEH (test) | MMLU-PRO Top-1 Accuracy | 73.8 | 14 |
| LLM Routing | MMLU-Pro | Top-1 Accuracy | 76.7 | 14 |
| LLM Routing | SUPERGPQA (val) | Top-1 Accuracy | 0.536 | 14 |
| LLM Routing | MedMCQA | Top-1 Accuracy | 81.2 | 14 |
