Prompt Injection Defense on VPI-Bench
[Chart: ASR (Amazon), ASR (Booking), ASR (BBC). Highlighted entry: None, ASR (Amazon) 96.5, Apr 14, 2026.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | ASR (Amazon) | ASR (Booking) | ASR (BBC) |
|---|---|---|---|---|---|
| None | Agent Framework=Browse... | 2026.04 | 96.5 | 84.2 | 84.2 |
| System Prompt | Agent Framework=Browse... | 2026.04 | 92.98 | 85.96 | 85.96 |
| System Prompt | Agent Framework=Claude... | 2026.04 | 42.2 | 37.8 | 5.6 |
| None | Agent Framework=Claude... | 2026.04 | 31.7 | 36.7 | 16.7 |
| Guard-gpt-4o | Agent Framework=Browse... | 2026.04 | 22.8 | 15.8 | 21.1 |
| Guard-gpt-4o | Agent Framework=Claude... | 2026.04 | 10.6 | 12.2 | 4.4 |
| WebAgentGuard-8B | Agent Framework=Claude... | 2026.04 | 1.7 | 0 | 0 |
| WebAgentGuard-4B | Agent Framework=Claude... | 2026.04 | 0 | 0.6 | 0 |
| WebAgentGuard-8B | Agent Framework=Browse... | 2026.04 | 0 | 0 | 0 |
| WebAgentGuard-4B | Agent Framework=Browse... | 2026.04 | 0 | 1.8 | 0 |
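To compare entries across the three sites at a glance, the per-site ASR values listed above can be collapsed into a single number per row. The sketch below is only an illustration: the unweighted mean is an assumed summary, not a metric the benchmark reports, and the names `ROWS`, `mean_asr`, and `rank_by_mean_asr` are hypothetical. The framework labels are truncated in the source and are kept exactly as shown.

```python
# Rows copied from the results above:
# (method, framework label as shown, ASR Amazon, ASR Booking, ASR BBC).
# Framework labels are truncated in the source and are left as-is.
ROWS = [
    ("None", "Browse...", 96.5, 84.2, 84.2),
    ("System Prompt", "Browse...", 92.98, 85.96, 85.96),
    ("System Prompt", "Claude...", 42.2, 37.8, 5.6),
    ("None", "Claude...", 31.7, 36.7, 16.7),
    ("Guard-gpt-4o", "Browse...", 22.8, 15.8, 21.1),
    ("Guard-gpt-4o", "Claude...", 10.6, 12.2, 4.4),
    ("WebAgentGuard-8B", "Claude...", 1.7, 0.0, 0.0),
    ("WebAgentGuard-4B", "Claude...", 0.0, 0.6, 0.0),
    ("WebAgentGuard-8B", "Browse...", 0.0, 0.0, 0.0),
    ("WebAgentGuard-4B", "Browse...", 0.0, 1.8, 0.0),
]


def mean_asr(amazon, booking, bbc):
    """Unweighted mean of the three per-site ASR values (assumed summary)."""
    return (amazon + booking + bbc) / 3


def rank_by_mean_asr(rows):
    """Return (method, framework, mean ASR) tuples, lowest mean ASR first."""
    summarized = [
        (method, framework, round(mean_asr(a, b, c), 2))
        for method, framework, a, b, c in rows
    ]
    # sorted() is stable, so rows with equal means keep their listed order.
    return sorted(summarized, key=lambda item: item[2])


if __name__ == "__main__":
    for method, framework, mean in rank_by_mean_asr(ROWS):
        print(f"{method:<18} {framework:<10} {mean:6.2f}")
```

A lower mean indicates fewer successful attacks on average across Amazon, Booking, and BBC. A weighted mean or a per-site maximum would be equally defensible summaries, depending on which site matters most for a given deployment.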