FKA-Owl: Advancing Multimodal Fake News Detection through Knowledge-Augmented LVLMs

About

The massive generation of multimodal fake news involving both text and images exhibits substantial distribution discrepancies, prompting the need for generalized detectors. However, the insulated nature of training restricts the capability of classical detectors to obtain open-world facts. While Large Vision-Language Models (LVLMs) have encoded rich world knowledge, they are not inherently tailored for combating fake news and struggle to comprehend local forgery details. In this paper, we propose FKA-Owl, a novel framework that leverages forgery-specific knowledge to augment LVLMs, enabling them to reason about manipulations effectively. The augmented forgery-specific knowledge includes semantic correlation between text and images, and artifact trace in image manipulation. To inject these two kinds of knowledge into the LVLM, we design two specialized modules to establish their representations, respectively. The encoded knowledge embeddings are then incorporated into LVLMs. Extensive experiments on the public benchmark demonstrate that FKA-Owl achieves superior cross-domain performance compared to previous methods. Code is publicly available at https://liuxuannan.github.io/FKA_Owl.github.io/.
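As a rough illustration of the idea in the abstract (two knowledge representations are encoded and then fed to the LVLM alongside its normal inputs), the following is a minimal sketch, not the authors' implementation. Every name here (`semantic_correlation`, `artifact_trace`, `augment_inputs`, the feature sizes, the random weights) is a hypothetical stand-in, and the simple element-wise operations only approximate what the paper's specialized modules might compute.

```python
import random


def linear(x, weight):
    """Project vector x (length d_in) with a weight matrix of d_out rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def init_weight(d_out, d_in, rng):
    """Small random projection matrix; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_out)]


def semantic_correlation(text_feat, image_feat):
    # Assumption: element-wise product as a toy text-image correlation feature.
    return [t * i for t, i in zip(text_feat, image_feat)]


def artifact_trace(patch_feats):
    # Assumption: mean absolute deviation of each patch from the average patch,
    # as a toy proxy for local manipulation artifacts.
    n, d = len(patch_feats), len(patch_feats[0])
    mean = [sum(p[k] for p in patch_feats) / n for k in range(d)]
    return [sum(abs(p[k] - mean[k]) for p in patch_feats) / n for k in range(d)]


def augment_inputs(text_feat, image_feat, patch_feats, token_embeds, w_sem, w_art):
    """Prepend two projected knowledge embeddings to the LVLM token embeddings."""
    sem_token = linear(semantic_correlation(text_feat, image_feat), w_sem)
    art_token = linear(artifact_trace(patch_feats), w_art)
    return [sem_token, art_token] + token_embeds


rng = random.Random(0)
d_feat, d_model = 4, 8  # hypothetical encoder and LVLM embedding sizes
text_feat = [rng.random() for _ in range(d_feat)]
image_feat = [rng.random() for _ in range(d_feat)]
patch_feats = [[rng.random() for _ in range(d_feat)] for _ in range(6)]
token_embeds = [[rng.random() for _ in range(d_model)] for _ in range(3)]
w_sem = init_weight(d_model, d_feat, rng)
w_art = init_weight(d_model, d_feat, rng)

augmented = augment_inputs(text_feat, image_feat, patch_feats, token_embeds, w_sem, w_art)
print(len(augmented), len(augmented[0]))  # 5 8
```

The sketch only shows the data flow: two extra embedding vectors are placed in front of the three prompt token embeddings, so the model input grows from three to five vectors of the same width.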

Xuannan Liu, Peipei Li, Huaibo Huang, Zekun Li, Xing Cui, Jiahao Liang, Lixiong Qin, Weihong Deng, Zhaofeng He • 2024

Related benchmarks

Task | Dataset | Result | Rank
Multi-modal manipulation detection | ROM NYT domain 1.0 (test) | Accuracy 95.76 | 23
Multi-modal Forgery Detection and Grounding | MDSM NYT | Accuracy 94.67 | 14
Multi-modal Forgery Detection and Grounding | MDSM Guardian | Accuracy 92.6 | 14
Multi-modal Forgery Detection and Grounding | MDSM USA | Accuracy 80.9 | 14
Multi-modal Forgery Detection and Grounding | MDSM BBC | Accuracy 87.61 | 14
Multi-modal Forgery Detection and Grounding | MDSM AVG | Accuracy 84.12 | 14
Multi-modal Forgery Detection and Grounding | MDSM (Wash.) | Accuracy 78.88 | 14
Coarse-level Multimodal Misinformation Detection | MiRAGe News | Accuracy 63.7 | 14
Coarse-level Multimodal Misinformation Detection | AMG | Accuracy 70.7 | 14
Coarse-level Multimodal Misinformation Detection | MMFakeBench | Accuracy 64.7 | 14
Showing 10 of 42 rows