
PropGuard: Safeguarding LLM-MAS via Propagation-Aware Exploration and Remediation

About

LLM-based multi-agent systems (LLM-MAS) have become a promising paradigm for solving complex tasks through role specialization, tool use, memory, and collaborative reasoning. However, these interactions create new security risks: malicious instructions injected through messages, tools, or memories can propagate across agents and rounds, causing system-level compromise. Existing defenses largely rely on local filtering or graph-based anomaly detection, but they often fail to trace fine-grained propagation paths or to remediate contaminated states without disrupting benign collaboration. We propose PropGuard, a propagation-aware framework for safeguarding LLM-MAS. PropGuard constructs a dual-view spatio-temporal graph that combines response-centric risk estimation with full-state evidence preservation. Guided by these risk priors, a GE-GRPO-trained inspector sequentially explores the full-state graph to recover compact suspicious propagation subgraphs. PropGuard then verifies harmful propagation through subgraph-aware diagnosis and applies source-guided remediation to correct upstream contamination and replay affected downstream interactions. Experiments across four communication architectures and five attack settings demonstrate that PropGuard consistently lowers attack success while maintaining high task-level defense success, achieving a favorable effectiveness-efficiency trade-off.
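To make the idea of propagation across agents and rounds concrete, the following is a minimal sketch, not PropGuard's implementation. It assumes a simplified model in which each node is an (agent, round) state, a message sent in round r is consumed by its receiver in round r + 1, and every agent carries its own state forward through memory. The names `build_graph` and `downstream_states` are hypothetical. Given a contaminated source state, a breadth-first search lists the downstream states that a source-guided remediation step would plausibly need to replay.

```python
from collections import defaultdict, deque


def build_graph(messages, agents, num_rounds):
    """Build a toy spatio-temporal graph over (agent, round) states.

    messages: iterable of (sender, receiver, round) tuples.
    Assumption: a message sent in round r affects the receiver in round r + 1.
    """
    graph = defaultdict(set)
    for agent in agents:
        for r in range(num_rounds - 1):
            # Temporal edge: an agent's memory carries into its next round.
            graph[(agent, r)].add((agent, r + 1))
    for sender, receiver, r in messages:
        if r + 1 < num_rounds:
            # Spatial edge: the sender's output reaches the receiver.
            graph[(sender, r)].add((receiver, r + 1))
    return graph


def downstream_states(graph, source):
    """Return all states reachable from `source`, ordered by round, then agent."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(source)
    return sorted(seen, key=lambda state: (state[1], state[0]))


if __name__ == "__main__":
    graph = build_graph(
        messages=[("A", "B", 0), ("B", "C", 1)],
        agents=["A", "B", "C"],
        num_rounds=3,
    )
    # States that would need replay if agent A is contaminated in round 0.
    print(downstream_states(graph, ("A", 0)))
```

In this toy chain, agent C is only affected from round 2 onward, so its earlier states would not need to be replayed. Limiting replay to reachable states is one way to avoid disrupting benign collaboration.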

Bingyu Yan, Xiaoming Zhang, Jinyu Hou, Chaozhuo Li, Ziyi Zhou, Xiaozhe Zhang, Litian Zhang • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Prompt Injection | MMLU | ASR@3: 12 | 91 |
| Malicious Advice Defense | PoisonRAG | ASR: 5 | 36 |
| Prompt Injection | MATH | Attack Success Rate (ASR): 6 | 36 |
| Prompt Injection | CSQA | ASR: 21 | 36 |
| Trojan Attack | InjecAgent | ASR: 9 | 36 |
| Prompt Injection | MMLU random topology | -- | 16 |
| Memory Attack Defense | PoisonRAG random architecture | ASR: 7.7 | 6 |
| Prompt Injection Defense | CSQA random architecture | ASR: 18.3 | 6 |
| Prompt Injection Defense | MATH random architecture | ASR: 6 | 6 |
| Tool Attack Defense | InjecAgent random architecture | ASR: 7.3 | 6 |
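The results above are reported as attack success rate (ASR). As a rough illustration, ASR is commonly computed as the share of attack attempts that succeed. The sketch below assumes the value is expressed as a percentage, which this page does not state explicitly. The function name `attack_success_rate` is hypothetical, and the sketch does not cover the ASR@3 variant.

```python
def attack_success_rate(outcomes):
    """Return the percentage of attack attempts that succeeded.

    outcomes: iterable of booleans, True where the attack succeeded.
    Assumption: ASR is reported on a 0-100 scale.
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one attack attempt")
    return 100.0 * sum(bool(o) for o in outcomes) / len(outcomes)


if __name__ == "__main__":
    # One successful attack out of four attempts.
    print(attack_success_rate([True, False, False, False]))
```

Under this definition, a lower value means the defense blocked more attacks.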