Indirect Prompt Injection Attack Success Evaluation on LLM Behavior Goal-Adjacent
[Chart: IRany scores (y-axis ticks 75.04, 81.52, 88, 94.48); highlighted point: GPT-5.5, IRany 100, May 14, 2026. Selectable metrics: IRany, RRsem, RRemm, URemm, URctx, E2Esem, E2Eemm, E2Ectx]
Updated 16d ago
Evaluation Results
Method      Date     IRany  RRsem  RRemm  URemm  URctx  E2Esem  E2Eemm  E2Ectx
GPT-5.5     2026.05  100    99     95     61.1   54     60.5    58      54
GPT-5.4     2026.05  99     99     94.9   44.7   42.4   43.8    42      42
Kimi-K2.6   2026.05  99     98     94.9   73.4   77.8   71.2    69      77
Gemini-3.1  2026.05  97     97.9   96.9   78.7   83.5   74.7    74      81
Sonnet-4.6  2026.05  76     98.7   88.2   79.1   72.4   59.3    53      55