Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

To Protect the LLM Agent Against the Prompt Injection Attack with Polymorphic Prompt

About

LLM agents are widely used as agents for customer support, content generation, and code assistance. However, they are vulnerable to prompt injection attacks, where adversarial inputs manipulate the model's behavior. Traditional defenses like input sanitization, guard models, and guardrails are either cumbersome or ineffective. In this paper, we propose a novel, lightweight defense mechanism called Polymorphic Prompt Assembling (PPA), which protects against prompt injection with near-zero overhead. The approach is based on the insight that prompt injection requires guessing and breaking the structure of the system prompt. By dynamically varying the structure of system prompts, PPA prevents attackers from predicting the prompt structure, thereby enhancing security without compromising performance. We conducted experiments to evaluate the effectiveness of PPA against existing attacks and compared it with other defense methods.

Zhilong Wang, Neha Nagaraja, Lan Zhang, Hayretdin Bahsi, Pawan Patil, Peng Liu• 2025

Related benchmarks

TaskDatasetResultRank
Unsafe Instruction MitigationLibero Harm (test)
ASR41.2
3
Showing 1 of 1 rows

Other info

Follow for update