Backdoor Attack on GPT-5-generated shopping requests
[Chart: ASR and Utility by date (May 10, 2026). Highlighted result: BadDLM, 91.2 ASR.]
Updated 22d ago
Evaluation Results
Method              Model               Date     ASR   Utility
BadDLM              LLaDA-8B-Instruct   2026.05  91.2  65.4
RL-based            LLaDA-8B-Instruct   2026.05  72.8  64.5
VPI                 LLaDA-8B-Instruct   2026.05  41.2  65.5
SFT-based           LLaDA-8B-Instruct   2026.05  38.5  65.4
Benign (No Attack)  LLaDA-8B-Instruct   2026.05   2.7  65.5