
MICA: Multivariate Infini Compressive Attention for Time Series Forecasting

About

Multivariate forecasting with Transformers faces a core scalability challenge: modeling cross-channel dependencies via attention compounds attention's quadratic sequence complexity with quadratic channel scaling, making full cross-channel attention impractical for high-dimensional time series. We propose Multivariate Infini Compressive Attention (MICA), an architectural design to extend channel-independent Transformers to channel-dependent forecasting. By adapting efficient attention techniques from the sequence dimension to the channel dimension, MICA adds a cross-channel attention mechanism to channel-independent backbones that scales linearly with channel count and context length. We evaluate channel-independent Transformer architectures with and without MICA across multiple forecasting benchmarks. MICA reduces forecast error over its channel-independent counterparts by 5.4% on average and up to 25.4% on individual datasets, highlighting the importance of explicit cross-channel modeling. Moreover, models with MICA rank first among deep multivariate Transformer and MLP baselines. MICA models also scale more efficiently with respect to both channel count and context length than Transformer baselines that compute attention across both the temporal and channel dimensions, establishing compressive attention as a practical solution for scalable multivariate forecasting.
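The abstract does not spell out MICA's equations, so the following is only a minimal sketch of the general idea it names: compressing keys and values from all channels into a fixed-size memory so that cross-channel attention costs grow linearly with channel count and context length. It assumes an Infini-attention-style linear memory with an ELU+1 feature map; the function names `elu_plus_one` and `cross_channel_compressive_attention` are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def elu_plus_one(x):
    # Positive feature map ELU(x) + 1, common in linear attention (assumed here).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def cross_channel_compressive_attention(q, k, v):
    """Illustrative sketch; q, k, v have shape (channels, length, dim)."""
    sq, sk = elu_plus_one(q), elu_plus_one(k)
    # Compress the keys/values of every channel and time step into a single
    # (dim, dim) memory plus a (dim,) normaliser. Cost: O(channels * length * dim^2).
    memory = np.einsum("cld,cle->de", sk, v)
    norm = sk.sum(axis=(0, 1))
    # Each query reads from the shared memory instead of attending to all
    # (channel, time) pairs, so no channels-by-channels score matrix is built.
    num = sq @ memory          # (channels, length, dim)
    den = sq @ norm            # (channels, length)
    return num / den[..., None]

# Usage with random inputs: 7 channels, 96 time steps, 16 features.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(7, 96, 16)) for _ in range(3))
out = cross_channel_compressive_attention(q, k, v)
print(out.shape)  # (7, 96, 16)
```

In this simplified form every query reads the same memory, which also contains its own channel. The result equals a normalised kernel attention over all channel and time positions, but it is computed without forming the pairwise score matrix, which is where the linear scaling comes from.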

Willa Potosnak, Nina Żukowska, Michał Wiliński, Dan Howarth, Ignacy Stępka, Mononito Goswami, Artur Dubrawski • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Time Series Forecasting | ETT1 | RMSE 231.3 | 62 |
| Time Series Forecasting | Iowa PLOWS 5min | MAE 1.327 | 13 |
| Time Series Forecasting | Jena Weather Hourly | MAE 9.387 | 13 |
| Time Series Forecasting | M-DENSE Hourly | MAE 87.412 | 13 |
| Time Series Forecasting | Loop-Seattle Daily | MAE 2.939 | 13 |
| Time Series Forecasting | Jena Weather H | RMSE 33.999 | 13 |
| Time Series Forecasting | M-DENSE (H) | RMSE 197 | 13 |
| Time Series Forecasting | Simglucose 5min | MAE 4.241 | 13 |
| Time Series Forecasting | Iowa IHOP SMEX02 5min | MAE 1.662 | 13 |
| Time Series Forecasting | ETT1 Hourly | MAE 5.403 | 13 |
Showing 10 of 32 rows
