
LOCUS: Low-Dimensional Model Embeddings for Efficient Model Exploration, Comparison, and Selection

About

The rapidly growing ecosystem of Large Language Models (LLMs) makes it increasingly challenging to manage and utilize the vast and dynamic pool of models effectively. We propose LOCUS, a method that produces low-dimensional vector embeddings that compactly represent a language model's capabilities across queries. LOCUS is an attention-based approach that generates embeddings by a deterministic forward pass over query encodings and evaluation scores via an encoder model, enabling seamless incorporation of new models to the pool and refinement of existing model embeddings without having to perform any retraining. We additionally train a correctness predictor that uses model embeddings and query encodings to achieve state-of-the-art routing accuracy on unseen queries. Experiments show that LOCUS needs up to 4.8x fewer query evaluation samples than baselines to produce informative and robust embeddings. Moreover, the learned embedding space is geometrically meaningful: proximity reflects model similarity, enabling a range of downstream applications including model comparison and clustering, model portfolio selection, and resilient proxies of unavailable models.
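To make the idea of a retraining-free embedding concrete, here is a minimal sketch of attention pooling over (query encoding, evaluation score) pairs. It is not the LOCUS architecture: the single-head pooling, the dimensions, the random untrained weights, and the names `embed_model` and `nearest_model` are all assumptions made for illustration. It only shows why a deterministic forward pass lets a new model join the pool without retraining, and how proximity in the embedding space could be used to pick a proxy model.

```python
import numpy as np

rng = np.random.default_rng(0)
D_QUERY, D_EMBED = 16, 4  # hypothetical sizes, not taken from the paper

# Hypothetical, untrained encoder weights; a real encoder would learn these.
W_key = rng.normal(size=(D_QUERY + 1, D_EMBED))
W_value = rng.normal(size=(D_QUERY + 1, D_EMBED))
pool_query = rng.normal(size=D_EMBED)


def embed_model(query_encodings, scores):
    """Attention-pool (query encoding, score) pairs into one model embedding."""
    # Each token is a query encoding with the model's score on it appended.
    tokens = np.concatenate([query_encodings, scores[:, None]], axis=1)
    keys = tokens @ W_key
    values = tokens @ W_value
    # Softmax attention of a single pooling vector over all tokens.
    logits = keys @ pool_query / np.sqrt(D_EMBED)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ values


def nearest_model(target, pool):
    """Return the name of the pool model whose embedding is closest to target."""
    return min(pool, key=lambda name: np.linalg.norm(pool[name] - target))


# Toy data: 32 query encodings and binary correctness scores for two models.
queries = rng.normal(size=(32, D_QUERY))
scores_a = rng.integers(0, 2, size=32).astype(float)
scores_b = 1.0 - scores_a  # a model that fails exactly where model A succeeds

emb_a = embed_model(queries, scores_a)
emb_b = embed_model(queries, scores_b)
pool = {"model_a": emb_a, "model_b": emb_b}

# A new model is embedded with one forward pass; no weights are updated.
emb_new = embed_model(queries, scores_a)
print(nearest_model(emb_new, pool))
```

Because the pooling is a softmax-weighted sum, the embedding does not depend on the order of the evaluation samples, and adding more samples for an existing model only requires rerunning the same forward pass.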

Shivam Patel, William Cocke, Gauri Joshi • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Correctness Prediction | LogiQA | Accuracy 67.75 | 18 |
| Correctness Prediction | TruthQA | Accuracy 69.12 | 18 |
| Correctness Prediction | GPQA | Accuracy 79.42 | 18 |
| Correctness Prediction | Overall Combined Datasets | Accuracy 70.03 | 18 |
| Model Routing | Model Routing Suite (MathQA, LogiQA, MedQA, PIQA, TruthQA, MMLU, GSM8k, GPQA, ASDiv, SoQA) | Overall Accuracy 64.7 | 18 |
| Correctness Prediction | ASDIV | Accuracy 96.5 | 18 |
| Correctness Prediction | MedQA | Accuracy 60.89 | 18 |
| Correctness Prediction | MMLU | Accuracy 65.39 | 18 |
| Correctness Prediction | GSM8K | Accuracy 72.85 | 18 |
| Correctness Prediction | SoQA | Accuracy 66.83 | 18 |
Showing 10 of 14 rows
