Backdoor Attack on shopping requests GPT-5-generated (test)
[Chart: Attack Success Rate (ASR). Top result: BadDLM, 94.1, May 10, 2026. Updated 22d ago.]
Evaluation Results
Method              Details                     Date     Attack Success Rate (ASR)
BadDLM              Method=BadDLM (Ours),...    2026.05  94.1
RL-based            Method=RL-based, Base...    2026.05  69.6
VPI                 Method=VPI, Base Model...   2026.05  48.8
SFT-based           Method=SFT-based, Base...   2026.05  45.1
Benign (No Attack)  Method=Benign (No Atta...   2026.05  5.8