Learning Appearance-motion Normality for Video Anomaly Detection

About

Video anomaly detection is a challenging task in the computer vision community. Most single-task methods do not consider the independence of the distinct spatial and temporal patterns, while two-stream structures do not explore the correlations between them. In this paper, we propose a two-stream auto-encoder framework augmented with spatial and temporal memories, which learns appearance normality and motion normality independently and explores their correlations via adversarial learning. Specifically, we first design two proxy tasks to train the two-stream structure to extract appearance and motion features in isolation. Then, the prototypical features are recorded in the corresponding spatial and temporal memory pools. Finally, the encoding-decoding network performs adversarial learning with the discriminator to explore the correlations between spatial and temporal patterns. Experimental results show that our framework outperforms state-of-the-art methods, achieving AUCs of 98.1% and 89.8% on the UCSD Ped2 and CUHK Avenue datasets, respectively.

Yang Liu, Jing Liu, Mengyang Zhao, Dingkang Yang, Xiaoguang Zhu, Liang Song • 2022
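
The abstract says prototypical features are recorded in spatial and temporal memory pools, but it does not say how the pools are read. As a rough illustration only, the sketch below shows one common way memory-augmented auto-encoders read from a pool: score the query feature against each stored item by cosine similarity, apply a softmax, and return the weighted sum of the items. The function names (`cosine`, `softmax`, `read_memory`), the two-dimensional toy features, and the addressing scheme itself are assumptions made for this example, not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(query, memory_items):
    """Return (read_vector, weights) for a query against a memory pool.

    Assumed addressing scheme: softmax over cosine similarities.
    """
    weights = softmax([cosine(query, item) for item in memory_items])
    dim = len(query)
    read = [
        sum(w * item[d] for w, item in zip(weights, memory_items))
        for d in range(dim)
    ]
    return read, weights


# Hypothetical pool with two prototype items (toy values).
spatial_memory = [[1.0, 0.0], [0.0, 1.0]]
read, weights = read_memory([0.9, 0.1], spatial_memory)
```

In this toy case the query lies closer to the first prototype, so the first weight is the larger one and the read vector leans toward that item. A feature far from every stored prototype would be reconstructed poorly, which is the usual intuition for why memory pools help flag anomalies.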

Related benchmarks

Task	Dataset	Metric	Result (%)	
Video Anomaly Detection	CUHK Avenue (test)	Frame-level AUC	89.8	112
Video Anomaly Detection	UCSD Ped2 (test)	Frame-level AUC	98.1	107
Video Anomaly Detection	ShanghaiTech (SHT) (test)	Frame-level AUC	73.8	103
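
For readers unfamiliar with the metric, frame-level AUC is the area under the ROC curve computed from per-frame anomaly scores and binary ground-truth labels. A minimal sketch follows, using the equivalent pairwise-ranking formulation: the fraction of (anomalous, normal) frame pairs in which the anomalous frame scores higher, with ties counted as half. The function name `frame_level_auc` and the toy scores are illustrative assumptions; how the paper derives its per-frame scores is not covered here.

```python
def frame_level_auc(scores, labels):
    """AUC via pairwise ranking: P(score of anomalous frame > score of normal frame).

    scores: per-frame anomaly scores (higher means more anomalous).
    labels: per-frame ground truth, 1 for anomalous and 0 for normal.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one anomalous and one normal frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example: four frames, two of them anomalous.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
auc = frame_level_auc(scores, labels)
print(f"{100 * auc:.1f}")  # prints 75.0
```

The quadratic loop is fine for illustration; evaluation code for full datasets would normally use a sort-based implementation instead.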

