Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer

About

Learning-to-Defer routes each input to the expert that minimizes expected cost, but it assumes that the information available to every expert is fixed at decision time. Many modern systems violate this assumption: after selecting an expert, one may also choose what additional information that expert should receive, such as retrieved documents, tool outputs, or escalation context. We study this problem and call it Learning-to-Defer with advice. We show that a broad family of natural separated surrogates, which learn routing and advice with distinct heads, is inconsistent even in the smallest non-trivial setting. We then introduce an augmented surrogate that operates on the composite expert–advice action space and prove an $\mathcal{H}$-consistency guarantee together with an excess-risk transfer bound, yielding recovery of the Bayes-optimal policy in the limit. Experiments on tabular, language, and multi-modal tasks show that the resulting method improves over standard Learning-to-Defer while adapting its advice-acquisition behavior to the cost regime; a synthetic benchmark confirms the failure mode predicted for separated surrogates.
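
To make the composite action space concrete, the following is a minimal sketch of a cost-minimizing decision over (expert, advice) pairs for a single input. It is an illustration only, not the paper's surrogate or training procedure: the function `best_action`, the expert and advice names, and all cost values are hypothetical, and the expected costs are assumed to be known and to already include any fee for acquiring advice.

```python
def best_action(expected_cost):
    """Return the (expert, advice) pair with the lowest expected cost.

    expected_cost maps (expert, advice) -> expected cost for one input.
    Assumption: each cost already includes the price of acquiring the advice.
    """
    return min(expected_cost, key=expected_cost.get)


# Hypothetical expected costs for one input: two experts, advice on or off.
expected_cost = {
    ("model", "none"): 0.30,
    ("model", "retrieved_docs"): 0.12,
    ("human", "none"): 0.20,
    ("human", "retrieved_docs"): 0.25,
}

# Decision over the composite expert-advice action space.
joint = best_action(expected_cost)

# Standard Learning-to-Defer view: each expert's information is fixed (no advice).
no_advice = {pair: cost for pair, cost in expected_cost.items() if pair[1] == "none"}
fixed = best_action(no_advice)

print("with advice:", joint, expected_cost[joint])
print("without advice:", fixed, expected_cost[fixed])
```

With these made-up numbers, the preferred expert changes once advice becomes a choice: the fixed-information decision defers to the human, while the joint decision keeps the model and supplies it with retrieved documents.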

Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Learning to Defer | CIFAR-10H (test) | Coverage 53 | 25 |
| Classification with expert deferral | CIFAR-10 redundant expert suite (val) | System Accuracy 91.9 | 21 |
| Learning to Defer | CIFAR-10 with redundant synthetic experts | System Accuracy 91.9 | 21 |
| Learning to Defer | CIFAR-10H | System Accuracy 96.1 | 18 |
| Classification with Deferral | Covertype (test) | System Accuracy 93.4 | 7 |
