Indirect Prompt Injection Attack Success Evaluation on LLM Behavior Goal-Distant
[Chart: IRany score by model. Highlighted point: GPT-5.4, 100, May 14, 2026. Selectable metrics: IRany, RRsem, RRemm, URemm, URctx, E2Esem, E2Eemm, E2Ectx.]
Updated 16d ago
Evaluation Results
Method       Date      IRany   RRsem   RRemm   URemm   URctx   E2Esem   E2Eemm   E2Ectx
GPT-5.4      2026.05   100     51      6       16.7    0       8.5      1        0
GPT-5.5      2026.05   100     40      5       0       0       0        0        0
Kimi-K2.6    2026.05   100     41      7.9     14.3    5       5.9      1.1      5
Gemini-3.1   2026.05   92      42.4    4.3     0       4.3     0        0        4
Sonnet-4.6   2026.05   75      33.3    6.7     0       1.3     0        0        1