Lightweight Stylistic Consistency Profiling: Robust Detection of LLM-Generated Textual Content for Multimedia Moderation
About
The increasing prevalence of Large Language Models (LLMs) in content creation has made distinguishing human-written textual content from LLM-generated counterparts a critical task for multimedia moderation. Existing detectors often rely on statistical cues or model-specific heuristics, which makes them vulnerable to paraphrasing and adversarial manipulation and limits their robustness and interpretability. In this work, we propose LiSCP, a novel lightweight stylistic consistency profiling method for robust detection of LLM-generated textual content, focusing on feature stability under adversarial manipulation. Our approach constructs a consistency profile that combines discrete stylistic features with continuous semantic signals, leveraging stylistic stability across multimodal-guided paraphrased text variants. Experiments spanning real-world multimedia news and movie datasets and conventional text domains demonstrate that LiSCP achieves superior performance on in-domain detection and outperforms existing approaches by up to 11.79% in cross-domain settings. Additionally, it demonstrates notable robustness under adversarial scenarios, including adversarial attacks and hybrid human-AI settings.
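
To make the idea of a consistency profile concrete, the following is a minimal sketch and not the authors' implementation. It assumes three hand-picked stylistic statistics, a bag-of-words cosine similarity standing in for the continuous semantic signal, and paraphrase variants supplied by the caller; the function names (`stylistic_features`, `cosine_bow`, `consistency_profile`) are hypothetical, and the abstract does not specify the actual feature set, paraphraser, or classifier.

```python
import math
import re
from collections import Counter

WORD_RE = re.compile(r"[A-Za-z']+")


def stylistic_features(text):
    """Return a few style statistics (illustrative choices, not LiSCP's)."""
    words = WORD_RE.findall(text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    return [
        len(words) / max(len(sentences), 1),          # mean sentence length
        len(set(words)) / n_words,                    # type-token ratio
        sum(text.count(c) for c in ",;:") / n_words,  # punctuation rate
    ]


def cosine_bow(a, b):
    """Bag-of-words cosine similarity, a stand-in for an embedding model."""
    ca = Counter(WORD_RE.findall(a.lower()))
    cb = Counter(WORD_RE.findall(b.lower()))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def consistency_profile(original, variants):
    """Mean absolute style drift per feature, plus mean semantic similarity."""
    base = stylistic_features(original)
    drift = [0.0] * len(base)
    similarity = 0.0
    n = len(variants)
    for variant in variants:
        feats = stylistic_features(variant)
        for i, (b, f) in enumerate(zip(base, feats)):
            drift[i] += abs(b - f) / n
        similarity += cosine_bow(original, variant) / n
    return drift + [similarity]


if __name__ == "__main__":
    text = "The film opens slowly; the ending, however, lands well."
    paraphrases = [
        "The movie starts slowly, but its ending works well.",
        "It begins at a slow pace. The finale is effective.",
    ]
    print(consistency_profile(text, paraphrases))
```

In a setup like this, the resulting vector would be passed to a small downstream classifier; low drift across paraphrases is the kind of stability signal the abstract describes, though which direction indicates LLM authorship is an empirical question the sketch does not answer.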
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| LLM-generated content detection | Student Essay IvyPanda | AUROC | 0.9455 | 11 |
| LLM-generated content detection | Yelp Review | AUROC | 0.8718 | 11 |
| LLM-generated content detection | VisualNews | AUROC | 0.9746 | 11 |
| LLM-generated content detection | MM-IMDB | AUROC | 95.76 | 11 |
| Machine-generated text detection | Paper Abstract (test) | F1 Score | 91.54 | 11 |
| LLM-generated content detection | HumanEval | AUROC | 0.8108 | 11 |
| LLM-generated content detection | Reuter News 50_50 | AUROC | 0.9356 | 11 |
| Machine-generated text detection | HumanEval (test) | F1 Score | 83.33 | 3 |
| Machine-generated text detection | Student Essay (test) | F1 Score | 89.27 | 3 |
| Machine-generated text detection | Yelp Review (test) | F1 Score | 0.8083 | 3 |