AgentSafe: Safeguarding Large Language Model-based Multi-agent Systems via Hierarchical Data Management

About

Large language model (LLM)-based multi-agent systems (MAS) are revolutionizing autonomous communication and collaboration, yet they remain vulnerable to security threats such as unauthorized access and data breaches. To address this, we introduce AgentSafe, a novel framework that enhances MAS security through hierarchical information management and memory protection. AgentSafe classifies information by security level and restricts access to sensitive data to authorized agents. It incorporates two components: ThreatSieve, which secures communication by verifying information authority and preventing impersonation, and HierarCache, an adaptive memory management system that defends against unauthorized access and malicious poisoning and represents the first systematic defense for agent memory. Experiments across various LLMs show that AgentSafe significantly boosts system resilience, achieving defense success rates above 80% under adversarial conditions. AgentSafe also scales well, maintaining robust performance as the number of agents and the complexity of information grow. These results underscore the effectiveness of AgentSafe in securing MAS and its potential for real-world application.

Junyuan Mao, Fanci Meng, Yifan Duan, Miao Yu, Xiaojun Jia, Junfeng Fang, Yuxuan Liang, Kun Wang, Qingsong Wen • 2025
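
The abstract describes classifying information by security level and restricting sensitive data to authorized agents. A minimal sketch of that general idea follows. It is not the AgentSafe implementation: the names `Level`, `Agent`, and `LeveledMemory`, the three-level scheme, and the "read at or below your clearance" rule are all assumptions made for illustration, and the sketch does not model ThreatSieve or HierarCache.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical security levels; a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    SECRET = 2


@dataclass
class Agent:
    name: str
    clearance: Level  # highest level this agent may read (assumed rule)


@dataclass
class LeveledMemory:
    """Toy store that keeps one bucket of items per security level."""
    buckets: dict = field(default_factory=lambda: {lvl: [] for lvl in Level})

    def write(self, item: str, level: Level) -> None:
        # File the item under its assigned security level.
        self.buckets[level].append(item)

    def read(self, agent: Agent) -> list:
        # Return only items at or below the agent's clearance.
        visible = []
        for lvl in Level:
            if lvl <= agent.clearance:
                visible.extend(self.buckets[lvl])
        return visible


memory = LeveledMemory()
memory.write("meeting at 10:00", Level.PUBLIC)
memory.write("api key rotation schedule", Level.SECRET)

guest = Agent("guest", Level.PUBLIC)
admin = Agent("admin", Level.SECRET)
guest_view = memory.read(guest)  # only the public item
admin_view = memory.read(admin)  # both items
```

In this toy version the access check is a single integer comparison; a real system would also need to verify who is asking, which is the role the abstract assigns to ThreatSieve.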

Related benchmarks

Task | Dataset | Result | Rank
Targeted Attack | InjecAgent | ASR@3: 0.3 | 31
Malicious Advice Defense | PoisonRAG | ASR@3: 24.3 | 18
Prompt Injection Defense | GSM8K PI (Prompt Injection) (test) | ASR@1: 3.7 | 16
Prompt Injection Defense | PI (CSQA) random topology | ASR@1: 44.6 | 16
Prompt Injection | MMLU random topology | ASR (k=1): 24.5 | 16
Prompt Injection Defense | CSQA | ASR@3: 55.6 | 16
Tool Attack Defense | InjecAgent random topology (test) | ASR@1: 0.063 | 16