
When Large Language Models Do Not Work: Online Incivility Prediction through Graph Neural Networks

About

Online incivility has emerged as a widespread and persistent problem in digital communities, imposing substantial social and psychological burdens on users. Although many platforms attempt to curb incivility through moderation and automated detection, existing approaches often remain limited in both accuracy and efficiency. To address this challenge, we propose a Graph Neural Network (GNN) framework for detecting three types of uncivil behavior (i.e., toxicity, aggression, and personal attacks) within the English Wikipedia community. Our model represents each user comment as a node, with textual similarity between comments defining the edges, allowing the network to jointly learn from both linguistic content and the relational structure among comments. We also introduce a dynamically adjusted attention mechanism that adaptively balances nodal and topological features during information aggregation. Empirical evaluations demonstrate that our proposed architecture outperforms 12 state-of-the-art Large Language Models (LLMs) across multiple metrics while requiring significantly lower inference cost. These findings highlight the crucial role of structural context in detecting online incivility and expose the limitations of text-only LLM paradigms in behavioral prediction. All datasets and comparative outputs will be made publicly available in our repository to support further research and reproducibility.
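The abstract's core idea (comments as nodes, edges from textual similarity, and a gated blend of a node's own features with its neighbors') can be illustrated with a minimal stdlib-only sketch. Everything here is an assumption for illustration: the bag-of-words representation, the 0.3 similarity threshold, and the scalar `alpha` gate standing in for the paper's dynamically adjusted attention.

```python
# Hypothetical sketch of the comment graph described in the abstract.
# Comments are nodes; an edge connects two comments whose bag-of-words
# cosine similarity exceeds a threshold; one message-passing step then
# mixes each node's own features with the mean of its neighbors'.
import math
from collections import Counter

def cosine_sim(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_graph(comments, threshold=0.3):
    # Threshold is an illustrative choice, not a value from the paper.
    bows = [Counter(c.lower().split()) for c in comments]
    edges = {i: [] for i in range(len(comments))}
    for i in range(len(comments)):
        for j in range(i + 1, len(comments)):
            if cosine_sim(bows[i], bows[j]) >= threshold:
                edges[i].append(j)
                edges[j].append(i)
    return edges

def aggregate(features, edges, alpha=0.5):
    # One GNN-style aggregation step: alpha gates how much each node
    # keeps of its own (nodal) feature versus its neighborhood mean
    # (topological context). In the paper this balance is learned
    # dynamically; here it is a fixed scalar for clarity.
    out = []
    for i, f in enumerate(features):
        nbrs = edges[i]
        if nbrs:
            mean = [sum(features[j][k] for j in nbrs) / len(nbrs)
                    for k in range(len(f))]
        else:
            mean = f  # isolated node keeps its own features
        out.append([alpha * f[k] + (1 - alpha) * mean[k]
                    for k in range(len(f))])
    return out
```

A real implementation would use learned embeddings and a trainable attention module (e.g. in PyTorch Geometric), but the graph-construction and gated-aggregation pattern is the same.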

Zihan Chen, Lanyu Yu • 2025

Related benchmarks

Task                       Dataset                    Result (Accuracy %)   Rank
Aggression Detection       Wikipedia Detox            90.4                  13
Toxicity Detection         Wikipedia Detox            91.3                  13
Personal Attack Detection  Personal Attack Detection  88.7                  13
