Dataset-Driven Channel Masks in Transformers for Multivariate Time Series

About

Recent advancements in foundation models have been successfully extended to the time series (TS) domain, facilitated by the emergence of large-scale TS datasets. Capturing channel dependency (CD) is essential for modeling multivariate TS, and attention-based methods have been widely employed for this purpose. Nonetheless, these methods primarily focus on modifying the architecture, often neglecting the importance of dataset-specific characteristics. In this work, we introduce the concept of partial channel dependence (PCD) to enhance CD modeling in Transformer-based models by leveraging dataset-specific information to refine the CD captured by the model. To achieve PCD, we propose channel masks (CMs), which are integrated into the attention matrices of Transformers via element-wise multiplication. CMs consist of two components: 1) a similarity matrix that captures relationships between the channels, and 2) dataset-specific, learnable domain parameters that refine the similarity matrix. We validate the effectiveness of PCD across diverse tasks and datasets with various backbones. Code is available at this repository: https://github.com/YonseiML/pcd.
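
To make the channel mask idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that), and several details are assumptions that the abstract does not specify: the similarity matrix is taken to be the absolute Pearson correlation between channels, the domain parameters are two scalars `alpha` and `beta` applied through a sigmoid, and the mask is multiplied into the attention matrix after the softmax without renormalization. The function names are hypothetical, and in a trained model `alpha` and `beta` would be learnable rather than fixed floats.

```python
import numpy as np


def channel_similarity(x):
    """Absolute Pearson correlation between channels.

    x has shape (time_steps, channels). Using correlation as the
    similarity measure is an assumption made for this sketch.
    """
    return np.abs(np.corrcoef(x, rowvar=False))


def channel_mask(similarity, alpha, beta):
    """Refine the similarity matrix with two scalar domain parameters.

    The form sigmoid(alpha * similarity + beta) is a hypothetical choice;
    the abstract only says that learnable domain parameters refine the
    similarity matrix.
    """
    return 1.0 / (1.0 + np.exp(-(alpha * similarity + beta)))


def masked_channel_attention(q, k, v, mask):
    """Attention across channels with an element-wise channel mask.

    q, k, v have shape (channels, d_model); mask has shape
    (channels, channels). Applying the mask after the softmax is an
    assumption of this sketch.
    """
    d_model = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_model)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    attn = attn * mask  # element-wise multiplication with the channel mask
    return attn @ v, attn


# Toy data: 96 time steps, 4 channels; channel 1 closely follows channel 0.
rng = np.random.default_rng(0)
x = rng.standard_normal((96, 4))
x[:, 1] = x[:, 0] + 0.1 * rng.standard_normal(96)

sim = channel_similarity(x)
mask = channel_mask(sim, alpha=4.0, beta=-2.0)

# One embedding vector per channel (random stand-ins for real tokens).
tokens = rng.standard_normal((4, 8))
out, attn = masked_channel_attention(tokens, tokens, tokens, mask)
```

With these toy settings, the mask entry for the correlated pair (channels 0 and 1) is larger than for an uncorrelated pair, so attention between weakly related channels is scaled down more strongly. Changing `alpha` and `beta` shifts how sharply the mask separates similar from dissimilar channels, which is the role the abstract assigns to the dataset-specific domain parameters.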

Seunghan Lee, Taeyoung Park, Kibok Lee • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Time Series Forecasting | ETTh1 | MSE 0.405 | 836 |
| Time Series Forecasting | ECL | MSE 0.14 | 294 |
| Time Series Forecasting | PeMS08 | MSE 0.109 | 229 |
| Time Series Forecasting | PeMS04 | MSE 0.093 | 169 |
| Time Series Forecasting | Exchange | MSE 0.088 | 98 |
| Time Series Forecasting | ETTh2 | MSE 0.328 | 88 |
| Multivariate Time-series Forecasting | solar | MAE 0.585 | 74 |
| Time Series Forecasting | ETTh1 | MSE 0.492 | 63 |
| Time Series Forecasting | Weather | MSE 0.219 | 55 |
| Time Series Forecasting | ECL | MSE 0.149 | 24 |
Showing 10 of 25 rows
