Soft-DTW: a Differentiable Loss Function for Time-Series
About
We propose in this paper a differentiable learning loss between time series, building upon the celebrated dynamic time warping (DTW) discrepancy. Unlike the Euclidean distance, DTW can compare time series of variable size and is robust to shifts or dilations across the time dimension. To compute DTW, one typically solves a minimal-cost alignment problem between two time series using dynamic programming. Our work takes advantage of a smoothed formulation of DTW, called soft-DTW, that computes the soft-minimum of all alignment costs. We show that soft-DTW is a differentiable loss function, and that both its value and gradient can be computed with quadratic time/space complexity (DTW has quadratic time but linear space complexity). We show that this regularization is particularly well suited to averaging and clustering time series under the DTW geometry, a task for which our proposal significantly outperforms existing baselines. Next, we propose to tune the parameters of a machine that outputs time series by minimizing the soft-DTW discrepancy between its outputs and ground-truth labels.
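As a rough illustration of the soft-minimum recursion described above, the following is a minimal NumPy sketch of the soft-DTW value only (not its gradient). The function names `soft_min` and `soft_dtw`, the smoothing parameter name `gamma`, and the squared Euclidean ground cost are assumptions made for this example and are not taken from the abstract. Setting `gamma=0` falls back to the hard minimum, that is, the classical DTW recursion.

```python
import numpy as np

def soft_min(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma))).

    With gamma == 0 this is the ordinary minimum (classical DTW).
    """
    values = np.asarray(values, dtype=float)
    if gamma == 0.0:
        return values.min()
    z = -values / gamma
    z_max = z.max()
    if np.isinf(z_max):  # every input is +inf
        return np.inf
    # Log-sum-exp with the maximum factored out for numerical stability.
    return -gamma * (z_max + np.log(np.exp(z - z_max).sum()))

def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW value between two series of possibly different lengths.

    x, y: sequences of shape (n,) or (n, d) and (m,) or (m, d).
    Uses a squared Euclidean cost between time steps (an assumption here).
    Runs in O(n * m) time and space.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(len(x), -1)
    y = y.reshape(len(y), -1)
    n, m = len(x), len(y)

    # Pairwise cost matrix: cost[i, j] = ||x_i - y_j||^2.
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

    # r[i, j] holds the soft-min cost of aligning x[:i] with y[:j].
    r = np.full((n + 1, m + 1), np.inf)
    r[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            r[i, j] = cost[i - 1, j - 1] + soft_min(
                [r[i - 1, j - 1], r[i - 1, j], r[i, j - 1]], gamma
            )
    return r[n, m]
```

For example, `soft_dtw([0, 1, 2], [0, 0, 1, 1, 2, 2], gamma=0.0)` compares two series of different lengths, and for any `gamma > 0` the returned value is no larger than the `gamma=0` value because the soft-minimum lower-bounds the minimum.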
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Few-shot Image Classification | tieredImageNet | -- | -- | 190 |
| Few-shot classification | ImageNet mini | Accuracy | 96.96 | 92 |
| Few-shot classification | Omniglot | Accuracy | 94.55 | 66 |
| Few-shot classification | CUB-200 2011 | -- | -- | 66 |
| Few-shot Image Classification | StanfordCars | Accuracy | 0.7979 | 33 |
| Time-series classification | UCR Archive (test) | Accuracy | 78 | 20 |
| 5-way Image Classification | CIFAR-FS | Accuracy (1-shot) | 79.96 | 19 |
| Image Classification | ImageNet mini | 1-Shot Accuracy | 85.82 | 14 |
| Few-shot Action Recognition | NTU-60 | Accuracy (10 classes) | 53.7 | 11 |
| Few-shot Action Recognition | NTU 120 | Accuracy (20 classes) | 30.3 | 11 |