Anomaly Detection in Video Sequence with Appearance-Motion Correspondence
About
Anomaly detection in surveillance videos remains challenging because of the diversity of possible events. We propose a deep convolutional neural network (CNN) that addresses this problem by learning a correspondence between common object appearances (e.g. pedestrians, background, trees) and their associated motions. Our model combines a reconstruction network and an image translation model that share the same encoder. The former sub-network determines the most significant structures that appear in video frames, and the latter attempts to associate motion templates with those structures. Training uses only videos of normal events, after which the model can estimate frame-level scores for an unseen input. Experiments on six benchmark datasets demonstrate that the proposed approach is competitive with state-of-the-art methods.
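As a rough illustration of what a frame-level score can look like downstream, the sketch below maps per-frame errors (for instance, a discrepancy between predicted and observed motion) to scores in [0, 1] by min-max normalization within one video. This is a common convention in video anomaly detection and an assumption here, not necessarily the scoring rule used in the paper; the function name `normalize_frame_scores` and the sample error values are hypothetical.

```python
def normalize_frame_scores(errors):
    """Min-max normalize the per-frame errors of one video into [0, 1].

    Assumption: a larger error means a more anomalous frame.
    """
    if not errors:
        return []
    lo, hi = min(errors), max(errors)
    if hi == lo:
        # Every frame has the same error, so no frame stands out.
        return [0.0] * len(errors)
    return [(e - lo) / (hi - lo) for e in errors]


# Hypothetical per-frame errors for a five-frame clip.
frame_errors = [2.0, 2.5, 2.0, 6.0, 4.0]
scores = normalize_frame_scores(frame_errors)
print(scores)
```

Normalizing per video keeps scores comparable across clips whose raw error ranges differ, at the cost of always assigning a score of 1 to some frame even in a fully normal clip.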
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Video Anomaly Detection | CUHK Avenue (Ave) (test) | AUC 86.9 | 203 |
| Abnormal Event Detection | UCSD Ped2 (test) | AUC 96.2 | 146 |
| Abnormal Event Detection | UCSD Ped2 | AUC 96.2 | 132 |
| Video Anomaly Detection | Avenue (test) | AUC (Micro) 86.9 | 85 |
| Video Anomaly Detection | CUHK Avenue | Frame AUC 86.9 | 65 |
| Anomaly Detection | Avenue | Frame AUC (Micro) 86.9 | 55 |
| Abnormal Event Detection | Avenue (test) | -- | 37 |
| Video Anomaly Detection | CUHK Avenue (test) | Frame-level AUC 0.869 | 35 |
| Video Anomaly Detection | UCSD Ped2 (test) | Frame-level AUC 96.2 | 35 |
| Anomaly Detection | Avenue | AUC 0.869 | 30 |
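
The results in the table above are frame-level AUC values, reported either as a fraction (0.869) or as a percentage (86.9). As a minimal sketch of how such a number can be computed, the code below uses the pairwise-ranking definition of AUC: the probability that a randomly chosen anomalous frame receives a higher score than a randomly chosen normal frame, with ties counted as one half. The function name `frame_auc` and the labels and scores are hypothetical, and the leaderboard entries may have been produced with a different evaluation script.

```python
def frame_auc(labels, scores):
    """Frame-level AUC from binary labels (1 = anomalous) and scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous frame ranked above normal frame
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and scores for six frames.
labels = [0, 0, 0, 1, 1, 0]
scores = [0.1, 0.2, 0.6, 0.8, 0.5, 0.3]
auc = frame_auc(labels, scores)
print(auc)  # 0.875, i.e. 87.5 on the percentage scale
```

The quadratic loop is fine for illustration; for long test sets a rank-based or library implementation would be the practical choice.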