
Learning to Shuffle: Block Reshuffling and Reversal Schemes for Stochastic Optimization

About

Shuffling strategies for stochastic gradient descent (SGD), including incremental gradient, shuffle-once, and random reshuffling, are supported by rigorous convergence analyses for arbitrary within-epoch permutations. In particular, random reshuffling is known to improve optimization constants relative to cyclic and shuffle-once schemes. However, existing theory offers limited guidance on how to design new data-ordering schemes that further improve optimization constants or stability beyond random reshuffling. In this paper, we design a pipeline using a large language model (LLM)-guided program evolution framework to discover an effective shuffling rule for without-replacement SGD. Abstracting from this instance, we identify two fundamental structural components: block reshuffling and paired reversal. We analyze these components separately and show that block reshuffling strictly reduces prefix-gradient variance constants within the unified shuffling framework, yielding provable improvements over random reshuffling under mild conditions. Separately, we show that paired reversal symmetrizes the epoch map and cancels the leading order-dependent second-order term, reducing order sensitivity from quadratic to cubic in the step size. Numerical experiments with the discovered algorithm validate the theory and demonstrate consistent gains over standard shuffling schemes across convex and nonconvex benchmarks.
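The ordering schemes named in the abstract can be illustrated with a minimal sketch. The three baselines (incremental gradient, shuffle-once, random reshuffling) follow their standard definitions. The abstract does not spell out block reshuffling or paired reversal, so the two versions below are assumptions: block reshuffling is taken to mean permuting the order of contiguous index blocks, and paired reversal is taken to mean replaying the previous epoch's order backwards on every second epoch. The names `epoch_orders`, `block_reshuffle`, and `block_size` are hypothetical and do not come from the paper.

```python
import numpy as np


def block_reshuffle(n, block_size, rng):
    """ASSUMED form: split 0..n-1 into contiguous blocks, shuffle the block order."""
    blocks = [np.arange(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    order = rng.permutation(len(blocks))
    return np.concatenate([blocks[b] for b in order])


def epoch_orders(n, num_epochs, scheme, seed=0, block_size=4):
    """Yield one index permutation per epoch for the chosen ordering scheme."""
    rng = np.random.default_rng(seed)
    fixed = rng.permutation(n)  # drawn once; reused by shuffle-once
    prev = None
    for epoch in range(num_epochs):
        if scheme == "incremental":
            perm = np.arange(n)               # same cyclic order every epoch
        elif scheme == "shuffle_once":
            perm = fixed                      # one random order, reused
        elif scheme == "random_reshuffling":
            perm = rng.permutation(n)         # fresh random order each epoch
        elif scheme == "block_reshuffling":
            perm = block_reshuffle(n, block_size, rng)  # assumed interpretation
        elif scheme == "paired_reversal":
            # ASSUMED interpretation: even epochs draw a fresh order,
            # odd epochs traverse the previous order in reverse.
            perm = rng.permutation(n) if epoch % 2 == 0 else prev[::-1]
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        prev = perm
        yield perm.copy()


if __name__ == "__main__":
    for order in epoch_orders(8, 4, "paired_reversal", seed=0):
        print(order)
```

Each yielded array visits every sample index exactly once, so it can drive a without-replacement SGD epoch directly. How the discovered algorithm combines the two components is not stated here, so the sketch keeps them as separate schemes.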

Lam M. Nguyen, Dzung T. Phan, Jayant Kalagnanam • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | MNIST (train) | Training Loss: 1.00e-6 | 38 |
| Image Classification | Fashion-MNIST (train) | -- | 21 |
| Classification | Digits (train) | Training Loss: 0.1996 | 12 |
| Classification | a9a | Training Loss: 0.323 | 8 |
| Regression | California | Best Training Loss: 0.3716 | 8 |
| Regression | Boston | Training Loss: 0.3075 | 8 |
| Classification | breast_cancer | Best Training Loss: 0.0388 | 8 |
