
Stealthiness Evaluation on Harmful Prompts (evaluated on 3 LLMs and 4 guard LLMs)

Mean Perplexity: 3.23

ArtPrompt

Updated 2mo ago

Evaluation Results

Method	Date	Mean Perplexity	(second metric, unlabeled in source)
ArtPrompt	2024.10	3.23	1.89
Ascii	2024.10	4.13	0.49
base64	2024.10	10.46	3.06
Morse Cipher	2024.10	11.81	2.19
ReNeLLM	2024.10	15.56	5.69
Unicode	2024.10	42.19	33.57
UTF-8	2024.10	42.19	33.57
Origin	2024.10	49.9	51.63
Caesar Cipher	2024.10	258.1	182.96
FlipAttack	2024.10	809.67	506.4