Minimax Optimality and Spectral Routing for Majority-Vote Ensembles under Markov Dependence
About
Majority-vote ensembles achieve variance reduction by averaging over diverse, approximately independent base learners. When training data exhibits Markov dependence, as in time-series forecasting, reinforcement learning (RL) replay buffers, and spatial grids, this classical guarantee degrades in ways that existing theory does not fully quantify. We provide a minimax characterization of this phenomenon for discrete classification in a fixed-dimensional Markov setting, together with an adaptive algorithm that matches the rate on a graph-regular subclass. We first establish an information-theoretic lower bound for stationary, reversible, geometrically ergodic chains in fixed ambient dimension, showing that no measurable estimator can achieve excess classification risk better than $\Omega(\sqrt{\Tmix/n})$. We then prove that, on the AR(1) witness subclass underlying the lower-bound construction, dependence-agnostic uniform bagging is provably suboptimal with excess risk bounded below by $\Omega(\Tmix/\sqrt{n})$, exhibiting a $\sqrt{\Tmix}$ algorithmic gap. Finally, we propose \emph{adaptive spectral routing}, which partitions the training data via the empirical Fiedler eigenvector of a dependency graph and achieves the minimax rate $\mathcal{O}(\sqrt{\Tmix/n})$ up to a lower-order geometric cut term on a graph-regular subclass, without knowledge of $\Tmix$. Experiments on synthetic Markov chains, 2D spatial grids, the 128-dataset UCR archive, and Atari DQN ensembles validate the theoretical predictions. Consequences for deep RL target variance, scalability via Nystr\"om approximation, and bounded non-stationarity are developed as supporting material in the appendix.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Time-series classification | 128 UCR datasets | Avg Accuracy83.6 | 20 | |
| European crop classification | Sentinel-2 | Excess Risk8.7 | 6 | |
| European crop classification | BigEarthNet | Excess Risk10.6 | 6 | |
| Sea surface temperature grid prediction | NOAA SST | Excess Risk18.8 | 6 | |
| Deep Reinforcement Learning | Breakout Atari 2600 | IQM Return364 | 4 | |
| Deep Reinforcement Learning | SpaceInvaders Atari 2600 | IQM Return1.03e+3 | 4 | |
| Deep Reinforcement Learning | Seaquest Atari 2600 | IQM Return1.64e+3 | 4 | |
| Deep Reinforcement Learning | Q*bert Atari 2600 | IQM Return8.16e+3 | 4 | |
| Deep Reinforcement Learning | Pong Atari 2600 | IQM Return19.8 | 4 | |
| Deep Reinforcement Learning | Montezuma's Revenge Atari 2600 | IQM Return0.00e+0 | 4 |