Discovering Learning-Friendly Generation Orders for Sequential Computation

About

Sequential computation via autoregressive generation can make difficult tasks learnable, but the generation order of intermediate states strongly affects whether training succeeds. We address the problem of discovering a learning-friendly target order automatically, rather than relying on task-specific design. Our key observation is that learning-friendly orders cause a faster loss drop in the early stage of training. We exploit this by loss profiling, which ranks candidate orders by the early-stage loss of a single short run. To handle the factorial candidate space, we wrap loss profiling in a hierarchical global-local search over block-level and within-block orderings. On six order-sensitive tasks, the method discovers effective orders up to L=13 from random initialization and up to L=40 from structured initialization, lifting success rates from about 10% to near 100%. On integer multiplication, it rediscovers the reverse-digit order that was reported to be efficient in prior studies. On delay dynamical systems, as a case study of multivariate recurrences, learnability varies sharply even among valid topological sorts of the dependency graph: loss profiling identifies a learning-friendly one, and the global search even discovers orders surpassing hand-designed candidates.
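The ranking-plus-search idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`loss_profile`, `global_local_search`, `toy_early_loss`) are hypothetical, and the "early-stage loss" is replaced by a toy stand-in (the number of pairwise disagreements with a hidden reference order) so that the example runs without training a model. In practice, that function would presumably be a short training run that returns the observed loss.

```python
from itertools import permutations


def kendall_distance(order, reference):
    """Count the pairs that `order` ranks differently from `reference`."""
    pos = {item: i for i, item in enumerate(reference)}
    ranks = [pos[item] for item in order]
    return sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )


# ASSUMPTION: toy stand-in for "early-stage loss of a single short run".
# A real profiler would train briefly on targets in `order` and return the loss.
HIDDEN_FRIENDLY_ORDER = (5, 4, 3, 2, 1, 0)


def toy_early_loss(order):
    return float(kendall_distance(order, HIDDEN_FRIENDLY_ORDER))


def loss_profile(candidates, early_loss):
    """Rank candidate orders by early-stage loss, lowest first."""
    return sorted(candidates, key=lambda order: (early_loss(order), order))


def flatten(blocks):
    return tuple(x for block in blocks for x in block)


def global_local_search(initial, block_size, early_loss):
    """Hypothetical two-stage search: reorder blocks, then refine inside each."""
    blocks = [
        tuple(initial[i:i + block_size])
        for i in range(0, len(initial), block_size)
    ]

    # Global stage: profile every arrangement of whole blocks.
    block_candidates = [flatten(p) for p in permutations(blocks)]
    best = loss_profile(block_candidates, early_loss)[0]
    blocks = [
        tuple(best[i:i + block_size])
        for i in range(0, len(best), block_size)
    ]

    # Local stage: refine one block at a time, keeping the others fixed.
    for b in range(len(blocks)):
        candidates = [
            flatten(blocks[:b] + [tuple(p)] + blocks[b + 1:])
            for p in permutations(blocks[b])
        ]
        best = loss_profile(candidates, early_loss)[0]
        blocks[b] = best[b * block_size:(b + 1) * block_size]

    return flatten(blocks)


found = global_local_search(tuple(range(6)), block_size=2, early_loss=toy_early_loss)
print(found, toy_early_loss(found))  # (5, 4, 3, 2, 1, 0) 0.0
```

With six positions and blocks of two, the sketch profiles 3! = 6 block arrangements and then 2 arrangements per block, instead of all 6! = 720 orders. The exhaustive enumeration at each stage is a simplification chosen for clarity; it would not scale to the lengths reported above.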

Yuta Sato, Kazuhiko Kawamoto, Hiroshi Kera • 2025

Related benchmarks

Task	Dataset	Result
CUBIC	Synthetic Arithmetic Tasks	Success Rate: 1.00e+4	14
MLP	Synthetic Arithmetic Tasks	Success Rate: 100	14
RELU	Synthetic Arithmetic Tasks	Success Rate: 0.996	14
SINE	Synthetic Arithmetic Tasks	Success Rate: 100	14
Square	Synthetic Arithmetic Tasks	Success Rate: 100	14
TRIANGLE	Synthetic Arithmetic Tasks	Success Rate: 100	14
Multiplication	PROD L=20	Success Rate (Discovered): 98.2	2
Multiplication	PROD L=10	Success Rate (Discovered): 100	1
Multiplication	PROD L=12	Success Rate (Discovered): 51.4	1

