SafeLens: Deliberate and Efficient Video Guardrails with Fast-and-Slow Screening

About

The rapid growth of online video platforms and AI-generated content has made reliable video guardrails a key challenge for safety and real-world deployment. While most videos can be screened through fast pattern recognition, a small subset requires deeper reasoning over temporally complex content and nuanced policy constraints. Existing approaches typically apply large vision-language models uniformly across all inputs, resulting in high inference costs and inefficient allocation of computation. We propose SafeLens, a video guardrail framework that introduces a fast-and-slow inference architecture for efficient and accurate content moderation with variable computational cost across inputs. Additionally, we construct a high-quality dataset by applying influence-guided filtering to the SafeWatch dataset, retaining only 2.4% of the original data. To further address the limitations of training-time scaling, we enable test-time reasoning by augmenting the filtered data with structured Chain-of-Thought traces. Across real-world and AI-generated video benchmarks, SafeLens achieves state-of-the-art performance, outperforming strong open-source video guardrails (e.g., SafeWatch-8B, OmniGuard-7B) and closed-source models (e.g., GPT-5.4, Gemini-3.1-pro) while significantly reducing inference cost, demonstrating that efficient design can be more effective than scaling data or model size alone.

Shahriar Kabir Nahin, Hadi Askari, Muhao Chen, Anshuman Chhabra • 2026
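
The fast-and-slow screening idea in the abstract can be illustrated with a generic two-stage cascade. This is only a minimal sketch: the abstract does not say how SafeLens decides which videos need the slow path, so the confidence threshold, the `screen` function, the `Verdict` type, and the toy models below are all hypothetical stand-ins rather than the authors' method.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A model here is any callable mapping a video to (label, confidence).
Model = Callable[[str], Tuple[str, float]]


@dataclass
class Verdict:
    label: str         # "safe" or "unsafe"
    confidence: float  # in [0, 1]
    path: str          # which stage produced the verdict: "fast" or "slow"


def screen(video: str, fast_model: Model, slow_model: Model,
           threshold: float = 0.9) -> Verdict:
    """Route a video through a cheap screener first, escalating if unsure.

    Assumption: routing is driven by the fast model's confidence. The real
    system may use a different escalation signal.
    """
    label, conf = fast_model(video)
    if conf >= threshold:
        # The cheap pass is confident enough, so skip the expensive model.
        return Verdict(label, conf, "fast")
    # Ambiguous case: spend more compute on deliberate reasoning.
    label, conf = slow_model(video)
    return Verdict(label, conf, "slow")


# Toy stand-ins for the two models (hypothetical, for illustration only).
def toy_fast(video: str) -> Tuple[str, float]:
    return ("safe", 0.97) if video == "clip_a" else ("unsafe", 0.55)


def toy_slow(video: str) -> Tuple[str, float]:
    return ("unsafe", 0.88)


easy = screen("clip_a", toy_fast, toy_slow)
hard = screen("clip_b", toy_fast, toy_slow)
```

In this sketch `easy` is resolved by the fast path while `hard` is escalated, so compute cost varies per input in the way the abstract describes.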

Related benchmarks

Task	Dataset	Result	Rank
Content Moderation	SafeWatch-Real (val)	Sexual Accuracy: 87.9	14
Video Moderation	SafeWatch-GenAI (test)	Sexual Accuracy: 93.6	14

