Learning Event Completeness for Weakly Supervised Video Anomaly Detection

About

Weakly supervised video anomaly detection (WS-VAD) aims to localize the temporal intervals that contain anomalous events in untrimmed videos, using only video-level annotations. Because dense frame-level annotations are absent, existing WS-VAD methods often localize events incompletely. To address this issue, we present LEC-VAD (Learning Event Completeness for Weakly Supervised Video Anomaly Detection), which features a dual structure designed to encode both category-aware and category-agnostic semantics between vision and language. Within LEC-VAD, we devise semantic regularities that leverage an anomaly-aware Gaussian mixture to learn precise event boundaries, thereby yielding more complete event instances. In addition, we develop a memory bank-based prototype learning mechanism to enrich the concise text descriptions associated with anomaly-event categories. This strengthens the expressiveness of the text, which is crucial for advancing WS-VAD. LEC-VAD achieves notable gains over current state-of-the-art methods on two benchmark datasets, XD-Violence and UCF-Crime.

Yu Wang, Shiwei Chen • 2025

Related benchmarks

Task	Dataset	Metric	Result
Video Anomaly Detection	UCF-Crime	AUC	89.97	263
Video Anomaly Detection	XD-Violence (test)	AP	88.47	164
Fine-grained Video Anomaly Detection	UCF-Crime	mAP@IoU 0.1	19.65	7
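
The fine-grained result above is reported as mAP at a temporal IoU threshold, a metric that rewards complete event localization. The following is a minimal Python sketch of the temporal IoU test that underlies such a metric, not the benchmark's official evaluation code. The function names `temporal_iou` and `is_match` and the example intervals are hypothetical, chosen only for illustration. A full mAP computation would additionally rank detections by confidence and average precision over classes.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two temporal intervals given as (start, end)."""
    # Overlap length is zero when the intervals do not intersect.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold=0.1):
    """A prediction counts as a hit when its IoU reaches the threshold."""
    return temporal_iou(pred, gt) >= threshold


# Hypothetical example: a ground-truth event and two predictions (in seconds).
gt_event = (10.0, 30.0)
fragment = (12.0, 16.0)   # covers only part of the event
complete = (9.0, 29.0)    # covers nearly the whole event

for name, pred in [("fragment", fragment), ("complete", complete)]:
    print(name, round(temporal_iou(pred, gt_event), 3),
          is_match(pred, gt_event, threshold=0.1),
          is_match(pred, gt_event, threshold=0.5))
```

In this example the fragmentary prediction passes the loose 0.1 threshold but fails at 0.5, while the near-complete prediction passes both. This shows why incomplete localization lowers scores at stricter IoU thresholds.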

