
Compositional Generalization in Autoregressive Models via Logit Composition

About

Composing autoregressive models remains a core challenge in understanding how large language models can combine behaviors or skills learned across tasks. We introduce a new and principled composition strategy for autoregressive systems, inspired by composition methods developed for diffusion models. Under a factorized-conditionals assumption, we show that the resulting composition is projective: each component model preserves control over its own designated subspace of the output distribution, avoiding interference between models. This property is further preserved under smooth reparameterizations of the output space, yielding a feature-space theorem. Finally, we show that composition preserves length-generalizing behavior when the factorization assumptions and component guarantees hold uniformly at the target length. These results provide a principled understanding of when model composition and merging succeed in autoregressive systems and identify conditions under which their interactions remain stable.
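
The abstract does not state the composition rule itself, so the following is only an illustrative sketch under an assumption: that logits are combined the way diffusion scores commonly are, as a base model plus the sum of each component's offset from that base. The toy vocabulary, the distributions, and the names `compose_logits` and `joint_logits` are hypothetical and not taken from the paper. The example builds a next-token distribution that factorizes over two attributes and checks that each component keeps control of its own factor after composition.

```python
import math
from itertools import product


def log_softmax(logits):
    """Normalize a list of logits into log-probabilities."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def compose_logits(base, components):
    """ASSUMED rule (not confirmed by the source):
    composed = base + sum_i (component_i - base), then renormalize."""
    combined = [
        b + sum(c[k] - b for c in components)
        for k, b in enumerate(base)
    ]
    return log_softmax(combined)


# Hypothetical toy vocabulary: each token is a pair (colour, animal).
A = ["red", "blue"]
B = ["cat", "dog", "owl"]
tokens = list(product(A, B))

# Base model: independent factors over colour and animal.
base_a = {"red": 0.5, "blue": 0.5}
base_b = {"cat": 0.2, "dog": 0.3, "owl": 0.5}
# Component 1 only changes the colour factor; component 2 only the animal factor.
target_a = {"red": 0.9, "blue": 0.1}
target_b = {"cat": 0.6, "dog": 0.3, "owl": 0.1}


def joint_logits(pa, pb):
    """Log-probabilities of a distribution that factorizes as pa(a) * pb(b)."""
    return [math.log(pa[a] * pb[b]) for a, b in tokens]


base = joint_logits(base_a, base_b)
model_1 = joint_logits(target_a, base_b)
model_2 = joint_logits(base_a, target_b)

composed = compose_logits(base, [model_1, model_2])
probs = {tok: math.exp(lp) for tok, lp in zip(tokens, composed)}

# Marginals of the composed distribution over each factor.
marginal_a = {a: sum(probs[(a, b)] for b in B) for a in A}
marginal_b = {b: sum(probs[(a, b)] for a in A) for b in B}

print(marginal_a)
print(marginal_b)
```

In this factorized toy case the composed marginals coincide with `target_a` and `target_b`, which is one concrete reading of the "projective" property: neither component disturbs the factor owned by the other. When the conditionals do not factorize, this simple cancellation need not hold.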

Aakash Kumar, Maria Sofia Bucarelli, Emanuele Natale • 2026

Related benchmarks

Task         Dataset     Result               Rank
Coding       HumanEval+  Pass@1 30.5          164
Mathematics  MATH        MATH Accuracy 24.2   136
Coding       MBPP+       Pass@1 39.4          117
Mathematics  GSM8K       Accuracy 41.2        24
