Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning

About

The misuse of AI-driven video generation technologies has raised serious social concerns, highlighting the urgent need for reliable AI-generated video detectors. However, most existing methods are limited to binary classification and lack the necessary explanations for human interpretation. In this paper, we present Skyra, a specialized multimodal large language model (MLLM) that identifies human-perceivable visual artifacts in AI-generated videos and leverages them as grounded evidence for both detection and explanation. To support this objective, we construct ViF-CoT-4K for Supervised Fine-Tuning (SFT), which represents the first large-scale AI-generated video artifact dataset with fine-grained human annotations. We then develop a two-stage training strategy that systematically enhances our model's spatio-temporal artifact perception, explanation capability, and detection accuracy. To comprehensively evaluate Skyra, we introduce ViF-Bench, a benchmark comprising 3K high-quality samples generated by over ten state-of-the-art video generators. Extensive experiments demonstrate that Skyra surpasses existing methods across multiple benchmarks, while our evaluation yields valuable insights for advancing explainable AI-generated video detection.

Yifei Li, Wenzhao Zheng, Yanran Zhang, Runze Sun, Yu Zheng, Lei Chen, Jie Zhou, Jiwen Lu • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Video Forgery Detection | GenVideo (test) | Recall (Average) | 87.66 | 21 |
| Video Forgery Detection | Video Datasets ID (In-Domain) GenBuster++, LOKI | GenBuster++ Score | 52.1 | 16 |
| Video Forgery Detection | MintVid OOD | Fact Score | 51.9 | 16 |
| Video Forgery Detection | OOD (Out-of-Domain) Video | Vidu Q1 | 37.7 | 16 |
| Video Forgery Detection | ID, OOD, and OOD-MintVid Aggregated | Average Score | 52.5 | 16 |
| Video Forgery Detection | GenVideo | Sora Detection Rate | 0.9564 | 15 |
| AI-generated Video Detection | ViF-Bench T2V 1.0 (test) | Accuracy (Acc) | 91.02 | 13 |
| AI-generated Video Detection | ViF-Bench I2V 1.0 (test) | Accuracy | 91.02 | 7 |
| AI-generated Video Detection | GenVideo ModelScope | Accuracy | 79.93 | 6 |
| AI-generated Video Detection | GenVideo Morph Studio | Accuracy | 94.43 | 6 |
Showing 10 of 19 rows
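
Several of the results listed here are reported as accuracy or recall. As a rough illustration of what those two metrics measure for a binary real-vs-generated detector, here is a minimal Python sketch. It uses the standard textbook definitions and is not taken from the Skyra paper or from any benchmark's evaluation code. The label convention (1 = AI-generated, 0 = real) and the function names are assumptions made for this example, and each benchmark may aggregate its scores differently.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of videos whose predicted label matches the ground truth."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Fraction of truly positive videos that the detector flagged as positive.

    Assumption for this sketch: label 1 means AI-generated, 0 means real.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    positives = sum(t == positive for t in y_true)
    if positives == 0:
        raise ValueError("no positive samples in y_true")
    hits = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    return hits / positives


# Toy labels, invented purely for illustration (not benchmark data).
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
print(accuracy(y_true, y_pred))  # 3 of 5 predictions are correct
print(recall(y_true, y_pred))    # 2 of 3 generated videos are detected
```

Recall on the generated class is informative when a test set contains mostly or only generated videos, whereas accuracy needs both real and generated samples to be meaningful.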
