WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks

About

Autonomous UI agents powered by AI have tremendous potential to boost human productivity by automating routine tasks such as filing taxes and paying bills. However, a major challenge in unlocking their full potential is security, which is exacerbated by the agent's ability to take action on its user's behalf. Existing tests for prompt injections in web agents either over-simplify the threat by testing unrealistic scenarios or giving the attacker too much power, or look at single-step isolated tasks. To more accurately measure progress for secure web agents, we introduce WASP -- a new publicly available benchmark for end-to-end evaluation of Web Agent Security against Prompt injection attacks. Evaluating with WASP shows that even top-tier AI models, including those with advanced reasoning capabilities, can be deceived by simple, low-effort human-written injections in very realistic scenarios. Our end-to-end evaluation reveals a previously unobserved insight: while attacks partially succeed in up to 86% of cases, even state-of-the-art agents often struggle to fully complete the attacker's goals -- highlighting the current state of security by incompetence.

Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori, Chuan Guo, Kamalika Chaudhuri • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Instruction Injection Attack on Web Browser Agent | GitLab Medium | UUA: 100 | 16 |
| Instruction Injection Attack on Web Browser Agent | GitLab Long | UUA: 91.67 | 16 |
| Instruction Injection Attack on Web Browser Agent | GitLab Short | UUA: 100 | 16 |
| Instruction Injection Attack on Web Browser Agent | Reddit Medium | UUA: 100 | 8 |
| Instruction Injection Attack on Web Browser Agent | Reddit Short | UUA: 100 | 8 |
| Instruction Injection Attack on Web Browser Agent | Reddit Long | UUA: 55.56 | 8 |
| Web Browser Agent Hijacking | Reddit long-chain best-of-n sampling | UUA: 66.67 | 8 |
| Web Browser Agent Hijacking | Reddit medium-chain best-of-n sampling | UUA: 100 | 8 |
| Web Browser Agent Hijacking | Reddit short-chain best-of-n sampling | UUA: 100 | 8 |
| Stealthiness Evaluation | Long Web Browser | Dual-Goal Success Rate: 17.46 | 7 |
Showing 10 of 11 rows
