TrojanTO: Action-Level Backdoor Attacks against Trajectory Optimization Models

About

Recent advances in Trajectory Optimization (TO) models have achieved remarkable success in offline reinforcement learning. However, their vulnerability to backdoor attacks is poorly understood. We find that existing backdoor attacks in reinforcement learning rely on reward manipulation, which is largely ineffective against TO models because of their inherent sequence modeling nature. Moreover, high-dimensional action spaces further compound the challenge of action manipulation. To address these gaps, we propose TrojanTO, the first action-level backdoor attack against TO models. TrojanTO employs alternating training to strengthen the connection between triggers and target actions, which improves attack effectiveness. To improve attack stealth, it uses precise poisoning via trajectory filtering to preserve normal performance, and batch poisoning to keep the trigger consistent. Extensive evaluations demonstrate that TrojanTO effectively implants backdoors across diverse tasks and attack objectives with a low attack budget (0.3% of trajectories). Furthermore, TrojanTO applies broadly to DT, GDT, and DC, underscoring its scalability across diverse TO model architectures.
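
The abstract does not spell out the poisoning procedure, so the following is only a minimal sketch of generic action-level poisoning under a trajectory budget. It is not TrojanTO's algorithm. The function name `poison_dataset`, the dictionary layout of a trajectory, the trigger form (fixed values written into chosen state dimensions), and the random choice of trajectories and timesteps are all assumptions made for illustration. In particular, the random selection stands in for the paper's trajectory filtering, whose criterion is not given here.

```python
import copy
import random


def poison_dataset(trajectories, trigger, target_action, budget=0.003, seed=0):
    """Return a poisoned copy of `trajectories` and the indices that were modified.

    trajectories:  list of dicts with "states" and "actions" (lists of float lists)
    trigger:       dict {state_dimension: value}; a hypothetical trigger form
    target_action: list of floats the backdoor is meant to elicit
    budget:        fraction of trajectories to poison (0.003 = 0.3%)
    """
    rng = random.Random(seed)
    poisoned = copy.deepcopy(trajectories)  # leave the clean dataset untouched
    # Poison at least one trajectory so tiny datasets still get a sample.
    n_poison = max(1, round(budget * len(poisoned)))
    # Assumption: uniform random selection, not the paper's filtering rule.
    chosen = sorted(rng.sample(range(len(poisoned)), n_poison))
    for idx in chosen:
        traj = poisoned[idx]
        t = rng.randrange(len(traj["states"]))  # one poisoned timestep per trajectory
        for dim, value in trigger.items():
            traj["states"][t][dim] = value      # stamp the trigger into the state
        traj["actions"][t] = list(target_action)  # overwrite the action, not the reward
    return poisoned, chosen


# Toy data: 1000 trajectories, 5 steps each, 3-dim states, 2-dim actions.
clean = [
    {"states": [[0.0] * 3 for _ in range(5)],
     "actions": [[0.0] * 2 for _ in range(5)]}
    for _ in range(1000)
]
poisoned, chosen = poison_dataset(clean, trigger={0: 9.0}, target_action=[1.0, 1.0])
print(len(chosen))  # 3 trajectories, i.e. 0.3% of 1000
```

The point of the sketch is the scale and the locus of the change: only a handful of trajectories are touched, and the edit lands on state-action pairs, leaving rewards alone.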

Yang Dai, Oubo Ma, Longfei Zhang, Xingxing Liang, Xiaochun Cao, Shouling Ji, Jiaheng Zhang, Jincai Huang, Li Shen • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Backdoor Attack on Offline RL | D4RL Hopper v2 (offline) | ASR 93.1 | 9 |
| Backdoor Attack on Offline RL | D4RL HalfCheetah | ASR 100 | 9 |
| Backdoor Attack on Offline RL | D4RL walker2d | ASR 99.5 | 9 |
| Backdoor Attack on Offline RL | D4RL kitchen | ASR 96.9 | 9 |
| Trojan Attack (Target action: '1') | Hopper (Hopp) | Attack Success Rate (ASR) 100 | 9 |
| Trojan Attack (Target action: '1') | Halfcheetah | ASR 100 | 9 |
| Trojan Attack (Target action: '1') | Walker2d Walk | ASR 100 | 9 |
| Trojan Attack (Target action: '1') | Ant | ASR 100 | 9 |
| Trojan Attack (Target action: 'arithmetic') | Hopper (Hopp) | Attack Success Rate (ASR) 92.7 | 9 |
| Trojan Attack (Target action: 'arithmetic') | Walker2d Walk | ASR 100 | 9 |
Showing 10 of 24 rows
