Indirect Prompt Injection Attack Success Evaluation on Agent Action Goal-Distant 2
[Chart: IRany. Highlighted point: GPT-5.5, 88, May 14, 2026. Selectable metrics: IRany, RRsem, RRemm, URemm, URctx, E2Esem, E2Eemm, E2Ectx.]
Updated 16d ago
Evaluation Results
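| Method     | Date    | IRany | RRsem | RRemm | URemm | URctx | E2Esem | E2Eemm | E2Ectx |
|------------|---------|-------|-------|-------|-------|-------|--------|--------|--------|
| GPT-5.5    | 2026.05 | 88    | 100   | 19.3  | 11.8  | 12.5  | 10.4   | 2      | 11     |
| GPT-5.4    | 2026.05 | 87    | 100   | 14.9  | 30.8  | 13.8  | 26.8   | 4      | 12     |
| Kimi-K2.6  | 2026.05 | 87    | 100   | 17.2  | 20    | 13.8  | 17.4   | 3      | 12     |
| Gemini-3.1 | 2026.05 | 69    | 100   | 17.4  | 41.7  | 11.6  | 28.8   | 5      | 8      |
| Sonnet-4.6 | 2026.05 | 7     | 100   | 0     | -     | 0     | -      | 0      | 0      |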