LAVA: Layered Audio-Visual Anti-tampering Watermarking for Robust Deepfake Detection and Localization

About

Proactive watermarking offers a promising approach for deepfake tamper detection and localization in short-form videos. However, existing methods often decouple audio and visual evidence and assume that watermark signals remain reliable under real-world degradations, making tamper localization vulnerable to multimodal misalignment and compression distortions. Moreover, existing semi-fragile visual watermarking methods often degrade significantly under codec compression because their embedding bands overlap with compression-sensitive frequency regions. To address these limitations, we propose Layered Audio-Visual Anti-tampering Watermarking (LAVA), a calibration-aware audio-visual watermark fusion framework for deepfake tamper detection and localization. LAVA leverages cross-modal watermark fusion and calibration-aware alignment to preserve consistent and reliable tamper evidence under compression and audio-visual asynchrony, enabling robust tamper localization. Extensive experiments demonstrate that LAVA achieves near-perfect detection performance (AP = 0.999), remains robust to compression and multimodal misalignment, and significantly improves tamper localization reliability over existing audio-visual fusion baselines.

Bokang Zeng, Zheng Gao, Xiaoyu Li, Xiaoyan Feng, Jiaojiao Jiang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Deepfake Detection | FakeAVCeleb | -- | 9 |
| Frame-level Deepfake Detection | VoxCeleb2 | AP 99.1 | 8 |
| Frame-level Deepfake Detection | LAV-DF | AP 100 | 8 |
| Deepfake Detection | LAV-DF clean | AP 100 | 7 |
| Deepfake Localization | LAV-DF | IoU 95.5 | 4 |
| Tamper Detection | LAV-DF (50 groups) | AP 99.9 | 4 |
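
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how frame-level IoU and average precision (AP) are commonly computed. It is not the paper's evaluation code: the function names `temporal_iou` and `average_precision`, the use of per-frame binary masks, and the ranking-based (non-interpolated) AP definition are all assumptions made for illustration. Both functions return values in [0, 1], whereas the listing above appears to report them on a 0-100 scale.

```python
def temporal_iou(pred, truth):
    """Intersection over union of two equal-length per-frame binary masks.

    Hypothetical helper: 1 marks a frame flagged (pred) or labeled (truth)
    as tampered, 0 marks an untouched frame.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return 1.0 if union == 0 else intersection / union


def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive.

    Hypothetical helper: `scores` are detector confidences and `labels`
    are 1 for tampered samples, 0 for clean ones.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    # With no positives AP is undefined; return 0.0 by convention here.
    return sum(precisions) / len(precisions) if precisions else 0.0


if __name__ == "__main__":
    # Predicted tampered frames 1-3, labeled tampered frames 2-4.
    print(temporal_iou([0, 1, 1, 1, 0], [0, 0, 1, 1, 1]))
    # Four samples, two of them tampered.
    print(average_precision([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Benchmarks may instead define localization IoU over predicted time segments or use an interpolated AP, so numbers from this sketch are not guaranteed to match the reported results.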