
Decoupling Return-to-Go for Efficient Decision Transformer

About

The Decision Transformer (DT) has established a powerful sequence-modeling approach to offline reinforcement learning. It conditions its action predictions on the Return-to-Go (RTG), which it uses both to distinguish trajectory quality during training and to guide action generation at inference. In this work, we identify a critical redundancy in this design: feeding the entire sequence of RTGs into the Transformer is theoretically unnecessary, because only the most recent RTG affects action prediction. We show experimentally that this redundancy can impair DT's performance. To resolve this, we propose the Decoupled DT (DDT). DDT simplifies the architecture by passing only the observation and action sequences through the Transformer and using the latest RTG to guide action prediction. This streamlined approach both improves performance and reduces computational cost. Our experiments show that DDT significantly outperforms DT and is competitive with state-of-the-art DT variants across multiple offline RL tasks.
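To make the difference in inputs concrete, here is a minimal sketch of how the two input layouts described above might be assembled. It assumes the undiscounted RTG definition commonly used with DT (the sum of rewards from the current step onward). The function names `returns_to_go`, `dt_tokens`, and `ddt_inputs` are hypothetical, and the abstract does not say how DDT injects the latest RTG into the action prediction, so the sketch simply returns it alongside the token sequence.

```python
from typing import Any, List, Sequence, Tuple

Token = Tuple[str, Any]


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted return-to-go: rtg[t] is the sum of rewards from step t onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Accumulate from the last step backwards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def dt_tokens(rtgs: Sequence[float], observations: Sequence[Any],
              actions: Sequence[Any]) -> List[Token]:
    """DT-style layout: every step contributes an (RTG, observation, action) triple."""
    tokens: List[Token] = []
    for g, o, a in zip(rtgs, observations, actions):
        tokens += [("rtg", g), ("obs", o), ("act", a)]
    return tokens


def ddt_inputs(rtgs: Sequence[float], observations: Sequence[Any],
               actions: Sequence[Any]) -> Tuple[List[Token], float]:
    """DDT-style layout (assumed): only observations and actions form the
    sequence; the latest RTG is kept aside as a separate conditioning signal."""
    tokens: List[Token] = []
    for o, a in zip(observations, actions):
        tokens += [("obs", o), ("act", a)]
    return tokens, rtgs[-1]


rewards = [1.0, 2.0, 3.0]
rtgs = returns_to_go(rewards)
obs = ["s0", "s1", "s2"]
acts = ["a0", "a1", "a2"]

full_sequence = dt_tokens(rtgs, obs, acts)
short_sequence, latest_rtg = ddt_inputs(rtgs, obs, acts)
print(len(full_sequence), len(short_sequence), latest_rtg)
```

For a context of K steps, the DT-style layout yields 3K tokens and the DDT-style layout 2K. Since self-attention cost grows with sequence length, the shorter sequence is one plausible source of the reduced computational cost the abstract mentions.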

Yongyi Wang, Hanyu Liu, Lingfeng Li, Bozhou Chen, Ang Li, Qirui Zheng, Xionghui Yang, Wenxin Li • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Offline Reinforcement Learning | D4RL halfcheetah v2 (medium-replay) | Normalized Score | 37.8 | 58 |
| Offline Reinforcement Learning | D4RL walker-medium-replay v2 (test) | Normalized Reward | 77.6 | 16 |
| Offline Reinforcement Learning | D4RL Walker-medium-expert v2 | Normalized Return | 109.5 | 16 |
| Offline Reinforcement Learning | D4RL hopper-medium v2 (test) | Normalized Reward | 99.4 | 8 |
| Offline Reinforcement Learning | D4RL halfcheetah-medium-expert v2 (test) | Normalized Reward | 94.2 | 8 |
| Offline Reinforcement Learning | D4RL hopper-medium-expert v2 (test) | Normalized Reward | 1.11e+4 | 8 |
| Offline Reinforcement Learning | D4RL hopper-medium-replay v2 (test) | Normalized Reward | 9.25e+3 | 8 |
| Offline Reinforcement Learning | D4RL halfcheetah medium v2 (test) | Normalized Reward | 43 | 8 |
