
Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

About

Recently, dataset distillation has paved the way towards efficient machine learning, especially for image datasets. However, distillation for videos, characterized by an additional temporal dimension, remains an underexplored domain. In this work, we provide the first systematic study of video distillation and introduce a taxonomy to categorize temporal compression. Our investigation reveals that temporal information is usually not well learned during distillation, and that the temporal dimension of synthetic data contributes little. These observations motivate our unified framework for disentangling the dynamic and static information in videos. It first distills the videos into still images as static memory, then compensates for the dynamic and motion information with a learnable dynamic memory block. Our method achieves state-of-the-art performance on video datasets at different scales, with a notably smaller memory storage budget. Our code is available at https://github.com/yuz1wan/video_distillation.
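The static-dynamic disentanglement described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual implementation (see their repository for that): it assumes a per-class distilled still image as the static memory and a small learnable per-frame residual as the dynamic memory, composed by broadcasting the image over time.

```python
import numpy as np

# Illustrative shapes: T frames, C channels, H x W resolution (matching the
# 112x112 benchmark setting). All names here are assumptions for the sketch.
T, C, H, W = 8, 3, 112, 112

# Static memory: one distilled still image (per class).
static_memory = np.random.randn(C, H, W)

# Dynamic memory: a learnable per-frame motion residual, initialized small.
dynamic_memory = 0.01 * np.random.randn(T, C, H, W)

def compose_video(static, dynamic):
    """Broadcast the static image across time and add the dynamic residual."""
    return static[None, ...] + dynamic  # shape (T, C, H, W)

video = compose_video(static_memory, dynamic_memory)
```

The storage saving comes from the asymmetry: the static image carries most of the appearance information, while the dynamic block only needs to encode compact motion offsets and can be shared or kept small.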

Ziyu Wang, Yue Xu, Cewu Lu, Yong-Lu Li • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Action Recognition | Kinetics-400 | - | - | 481 |
| Video Action Recognition | HMDB51 | Top-1 Accuracy | 8.2 | 121 |
| Video Classification | MiniUCF 112x112 (test) | Accuracy | 27.2 | 19 |
| Video Classification | HMDB51 112x112 (test) | Accuracy | 8.2 | 19 |
| Video Classification | UCF Mini | Top-1 Acc | 27.2 | 18 |
| Video Action Recognition | UCF Mini | Storage (MB) | 94 | 17 |
| Video Action Recognition | HMDB51 | Storage (MB) | 94 | 17 |
| Action Recognition | SS v2 | Top-5 Accuracy | 4 | 13 |
| Action Retrieval | HMDB51 (1 VPC) | Recall@1 | 22.61 | 2 |
