
Indirect Prompt Injection Defense on HotpotQA

Attack Success Rate (ASR): 0.07

Encode_Base64

Apr 29, 2025
Updated 22d ago

Evaluation Results

Date       ASR      -        -
2025.04    0.07     8.42     7.01
2025.04    1.76     14.17    9.79
2025.04    3.05     4.24     3.19
2025.04    3.24     12.74    9.78
2025.04    5.51     14.38    13.32
2025.04    6.26     14.56    12.94
2025.04    8.68     5.23     3.67
2025.04    11.36    13.68    11.11
2025.04    15.23    16.21    10.97
2025.04    17.02    14.34    12.01
2025.04    21.67    14.14    7.69
2025.04    23.45    13.64    11.82
2025.04    25.6     14.1     10.12
2025.04    26.23    16.16    10.34
2025.04    40.17    12.56    5.17
2025.04    67.21    15.3     3.99
2025.04    69.01    16.24    5.12
2025.04    77.24    17.06    6.34
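
For readers unfamiliar with the headline metric, ASR is commonly computed as the share of injection attempts in which the model ends up following the injected instruction. The sketch below assumes that definition and assumes the values on this page are percentages; neither is stated here. The function name `attack_success_rate` and the boolean `outcomes` input are hypothetical and not taken from any listed method.

```python
def attack_success_rate(outcomes):
    """Return ASR as a percentage: successful attacks / total attempts * 100.

    `outcomes` is an iterable of booleans, one per injection attempt,
    True when the attack succeeded (assumed convention, not from the page).
    """
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one attack attempt")
    successes = sum(1 for succeeded in outcomes if succeeded)
    return 100.0 * successes / len(outcomes)


# Usage: one successful attack out of four attempts.
print(attack_success_rate([True, False, False, False]))
```

Lower is better under this reading: a defense that blocks every injected instruction would score 0.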