Leaping Into Memories: Space-Time Deep Feature Synthesis

About

The success of deep learning models has led to their adaptation and adoption by prominent video understanding methods. The majority of these approaches encode features in a joint space-time modality, whose inner workings and learned representations are difficult to interpret visually. We propose LEArned Preconscious Synthesis (LEAPS), an architecture-independent method for synthesizing videos from the internal spatiotemporal representations of models. Using a stimulus video and a target class, we prime a fixed space-time model and iteratively optimize a video initialized with random noise. Additional regularizers are used to improve the feature diversity of the synthesized videos alongside the cross-frame temporal coherence of motions. We quantitatively and qualitatively evaluate the applicability of LEAPS by inverting a range of spatiotemporal convolutional and attention-based architectures trained on Kinetics-400, which, to the best of our knowledge, has not been previously accomplished.
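The core loop described above (prime a fixed model, then iteratively optimize a noise-initialized video toward a target class under a temporal-coherence regularizer) can be sketched in miniature. The snippet below is a hedged toy illustration, not the paper's implementation: the frozen spatiotemporal network is replaced by a hypothetical linear class scorer, the regularizer is a simple squared frame-difference penalty, and all names, shapes, and constants (`w`, `lam`, the step size) are illustrative assumptions.

```python
import numpy as np

# Toy sketch of a LEAPS-style synthesis loop: gradient ascent on a
# noise-initialized video to raise a fixed model's target-class score,
# minus a temporal-coherence penalty. The "model" is a stand-in linear
# scorer, NOT the paper's architecture.

rng = np.random.default_rng(0)
T, H, W = 8, 16, 16                   # frames, height, width (toy sizes)
w = rng.standard_normal((T, H, W))    # stand-in for the target-class direction

def class_score(v):
    """Toy class logit: inner product with the class direction."""
    return float((w * v).sum())

def temporal_penalty(v):
    """Squared difference between consecutive frames (coherence term)."""
    return float(((v[1:] - v[:-1]) ** 2).sum())

def objective(v, lam=0.1):
    return class_score(v) - lam * temporal_penalty(v)

def grad(v, lam=0.1):
    """Analytic gradient of the toy objective w.r.t. the video."""
    g = w.copy()
    diff = v[1:] - v[:-1]
    g[1:] -= 2 * lam * diff           # ascent direction for later frames
    g[:-1] += 2 * lam * diff          # ascent direction for earlier frames
    return g

v = rng.standard_normal((T, H, W))    # video initialized with random noise
before = objective(v)
for _ in range(100):                  # iterative optimization
    v += 0.05 * grad(v)
after = objective(v)
print(after > before)                 # the objective improves over iterations
```

In the real method the scorer is a frozen pretrained space-time network and the gradient comes from backpropagation, but the structure of the loop (score, regularize, update the video) is the same.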

Alexandros Stergiou, Nikos Deligiannis • 2023

Related benchmarks

Task                          | Dataset              | Result                  | Rank
Embedding Similarity          | VidChapters7M        | Cosine Similarity: 6.45 | 6
Feature Visualization         | VidChapters7M        | FVD: 2.34e+3            | 3
Visual Explanation Generation | VidChapters7M (test) | FVD: 2.18e+3            | 3
