Model Compression with Exact Budget Constraints via Riemannian Manifolds

About

Assigning one of K options to each of N groups under a total cost budget is a recurring problem in efficient AI, including mixed-precision quantization, non-uniform pruning, and expert selection. The objective, typically model loss, depends jointly on all assignments and does not decompose across groups, preventing combinatorial solvers from directly optimizing the true objective and forcing reliance on proxy formulations. Methods such as evolutionary search evaluate the actual loss but lack gradient information, while penalty-based approaches enforce the budget only approximately and often require extensive hyperparameter tuning. We present a new approach by showing that, under softmax relaxation, the budget constraint defines a smooth Riemannian manifold in logit space with unusually simple geometry. The normal vector admits a closed-form expression, shifting logits along the cost vector changes expected cost monotonically, and vector transport reduces to a single inner product. Building on these properties, we propose Riemannian Constrained Optimization (RCO), which augments a standard Adam step with tangent projection, binary-search retraction, and momentum transport. Combined with Gumbel straight-through estimation and budget-constrained dynamic programming for discrete feasibility, RCO enables first-order optimization of the actual loss under exact budget enforcement without introducing constraint-specific hyperparameters. Across both synthetic benchmarks and realistic LLM compression settings, RCO matches or exceeds state-of-the-art methods while often requiring substantially less wall-clock time. Source code is available at https://github.com/IST-DASLab/RCO.
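The geometric ingredients named in the abstract can be illustrated with a small NumPy sketch. This is not the RCO implementation from the repository: the function names (`expected_cost`, `budget_normal`, `project_tangent`, `retract`) are hypothetical, a plain gradient step stands in for Adam, and Gumbel straight-through estimation and the dynamic-programming rounding are omitted. It assumes logits `theta` and per-option costs `cost` of shape (N, K), a budget strictly between the minimum and maximum achievable total cost, and at least one group whose options differ in cost.

```python
import numpy as np

def softmax(theta):
    # Row-wise softmax, stabilized by subtracting each row's maximum.
    z = theta - theta.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def expected_cost(theta, cost):
    # Total expected cost under the softmax relaxation: sum_i sum_k p_ik * c_ik.
    return float((softmax(theta) * cost).sum())

def budget_normal(theta, cost):
    # Gradient of the expected cost w.r.t. the logits:
    # d/d theta_ik = p_ik * (c_ik - E_{p_i}[c_i]).
    p = softmax(theta)
    mean_cost = (p * cost).sum(axis=1, keepdims=True)
    return p * (cost - mean_cost)

def project_tangent(vec, normal):
    # Remove the component of vec along the constraint normal.
    # Assumes the normal is nonzero (some group has options of differing cost).
    return vec - (vec * normal).sum() / (normal * normal).sum() * normal

def retract(theta, cost, budget, iters=60):
    # Shift logits along the cost vector: theta + t * cost.
    # The expected cost is non-decreasing in t (its derivative is a sum of
    # per-group cost variances), so t can be found by bisection.
    lo_cost, hi_cost = cost.min(axis=1).sum(), cost.max(axis=1).sum()
    if not (lo_cost < budget < hi_cost):
        raise ValueError("budget must lie strictly inside the achievable range")
    lo, hi = -1.0, 1.0
    while expected_cost(theta + lo * cost, cost) > budget:
        lo *= 2.0
    while expected_cost(theta + hi * cost, cost) < budget:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_cost(theta + mid * cost, cost) < budget:
            lo = mid
        else:
            hi = mid
    return theta + 0.5 * (lo + hi) * cost

# Toy setup (illustrative values only): 4 groups, 3 options costing 2, 4, 8.
rng = np.random.default_rng(0)
cost = np.tile(np.array([2.0, 4.0, 8.0]), (4, 1))
budget = 16.0

theta = retract(rng.normal(size=cost.shape), cost, budget)  # start on the manifold
grad = rng.normal(size=cost.shape)                          # stand-in for a loss gradient
normal = budget_normal(theta, cost)
step = project_tangent(grad, normal)                        # tangent projection
theta_new = retract(theta - 0.1 * step, cost, budget)       # step, then retraction
```

In this sketch, momentum transport would amount to calling `project_tangent` on the optimizer's momentum buffer with the normal evaluated at `theta_new`, which costs one inner product. That reading of the abstract is an assumption, not a confirmed detail of the method.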

Michael Helcig, Dan Alistarh • 2026

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Language Modeling	C4	Perplexity	17.4	1688
Commonsense Reasoning	HellaSwag	HellaSwag Accuracy	75.7	897
Multiple-choice Question Answering	MMLU	Accuracy	73.3	222
Language Modeling	FineWeb-Edu	PPL	11.14	141
Coreference Resolution	WinoGrande	Accuracy	71.4	72
Boolean Question Answering	BoolQ	Acc (Normalized)	88.5	20
Aggregated LLM Evaluation	8 Standard Benchmarks Aggregate	Average Accuracy	71	5
General Language Understanding and Reasoning	General LLM Evaluation Suite (ARC-C, ARC-E, BoolQ, HellaSwag, MMLU, OBQA, RTE, WinoGrande)	ARC-Challenge Accuracy	58.4	5
