A Regime Theory of Controller Class Selection for LLM Action Decisions

About

Deployed language and vision-language models must decide, on each input, whether to answer directly, retrieve evidence, defer to a stronger model, or abstain. Contrary to the common monotonicity intuition, greater per-input expressivity is not uniformly beneficial in finite samples: under identical strict cross-validation, different benchmarks prefer different controller classes. This reflects a finite-sample limitation of instance-level uncertainty signals, which can be exhausted at a distribution-dependent scale. We organize controllers into a nested lattice of four classes: fixed actions, partition routers, instance-level controllers, and prior-gated controllers, ordered by complexity. We prove a regime theory that turns three data-estimable bottlenecks into a class choice: how much improvement is possible beyond the best fixed action, whether there are enough samples for instance-level controllers to make reliable decisions, and how much improvement a coarse partition router can recover when instance-level signal is unreliable. The resulting Bernstein-tight threshold has a matching information-theoretic lower bound, and strict nested cross-validation provably selects a near-best class. Across SMS-Spam, HallusionBench, A-OKVQA, and FOLIO, the predicted class matches the empirical winner; the prior-gated controller wins on TextVQA when OCR tokens supply a label-free prediction-time prior. Code is available at https://github.com/Anonymous-Awesome-Submissions/Regime-Theory.
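As a rough illustration of the class-selection idea in the abstract, the sketch below compares two of the four controller classes (a fixed action and a partition router) by held-out loss under plain K-fold cross-validation. It is a simplified stand-in, not the paper's strict nested procedure or its Bernstein-type threshold. The function names (`fit_fixed`, `fit_partition`, `cv_select`), the per-input loss matrix `losses[i][a]`, and the `groups` labels are all hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict


def fit_fixed(losses, idx):
    """Return the single action with the lowest total training loss."""
    n_actions = len(losses[0])
    totals = [sum(losses[i][a] for i in idx) for a in range(n_actions)]
    return min(range(n_actions), key=lambda a: totals[a])


def fit_partition(losses, groups, idx):
    """Return a per-group action table and a fixed-action fallback."""
    by_group = defaultdict(list)
    for i in idx:
        by_group[groups[i]].append(i)
    table = {g: fit_fixed(losses, members) for g, members in by_group.items()}
    return table, fit_fixed(losses, idx)


def cv_select(losses, groups, k=5, seed=0):
    """Pick the class with the lowest mean held-out loss over k folds.

    Assumption: ties go to the simpler class, because "fixed" is listed
    first and min() returns the first minimal key.
    """
    order = list(range(len(losses)))
    random.Random(seed).shuffle(order)
    folds = [order[f::k] for f in range(k)]
    held_out = {"fixed": 0.0, "partition": 0.0}
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        fixed_action = fit_fixed(losses, train)
        table, fallback = fit_partition(losses, groups, train)
        for i in test:
            held_out["fixed"] += losses[i][fixed_action]
            # Groups unseen in training fall back to the fixed action.
            held_out["partition"] += losses[i][table.get(groups[i], fallback)]
    n = len(losses)
    scores = {name: total / n for name, total in held_out.items()}
    return min(scores, key=scores.get), scores


# Toy data (invented for illustration): two groups that prefer opposite actions.
groups = [i % 2 for i in range(40)]
losses = [[0.0, 1.0] if g == 0 else [1.0, 0.0] for g in groups]
best, scores = cv_select(losses, groups)
print(best, scores)
```

On this toy data the partition router recovers all of the improvement that no fixed action can reach. When every input prefers the same action, both classes tie and the sketch falls back to the simpler fixed-action class.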

Zhaoyang Jiang, Zhizhong Fu, Yunsoo Kim, Jiacong Mi, Zicheng Li, Xuanqi Peng, Honghan Wu • 2026

Related benchmarks

Task	Dataset	Result
Visual Question Answering	TextVQA OCR n=5000 (val)	Answer Loss: 0.8212	4
Knowledge-based Visual Question Answering	A-OKVQA n=1145 (held-out)	Per-Class Loss: 0.3805	3
Logical reasoning	FOLIO n=203 (held-out)	Per-Class Loss: 0.7195	3
Spam Detection	SMS-Spam n=1114 (held-out)	Per-Class Loss: 0.059	3
Visual Hallucination Detection	HallusionBench n=920 (held-out)	Per-Class Loss: 0.897	3

