Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following

About

Effective task representations should facilitate compositionality, such that after learning a variety of basic tasks, an agent can perform compound tasks consisting of multiple steps simply by composing the representations of the constituent steps together. While this is conceptually simple and appealing, it is not clear how to automatically learn representations that enable this sort of compositionality. We show that learning to associate the representations of current and future states with a temporal alignment loss can improve compositional generalization, even in the absence of any explicit subtask planning or reinforcement learning. We evaluate our approach across diverse robotic manipulation tasks as well as in simulation, showing substantial improvements for tasks specified with either language or goal images.
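The abstract does not spell out the form of the temporal alignment loss. As a minimal sketch, assuming a contrastive (InfoNCE-style) objective over a batch of paired embeddings, the idea of associating current and future state representations could look like the following. The function name `temporal_alignment_loss`, the `temperature` parameter, and the dot-product similarity are illustrative assumptions, not the authors' implementation.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def temporal_alignment_loss(current, future, temperature=1.0):
    """InfoNCE-style sketch (an assumption, not the paper's exact loss).

    current[i] is the embedding of a state, and future[i] is the embedding
    of a state reached later in the same trajectory. Each current embedding
    should score higher against its own future than against the futures of
    the other pairs in the batch.
    """
    if not current or len(current) != len(future):
        raise ValueError("current and future must be non-empty and equal in length")
    total = 0.0
    for i, z in enumerate(current):
        logits = [dot(z, f) / temperature for f in future]
        # Numerically stable log-sum-exp over all candidate futures.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching future (index i) as the target.
        total += log_norm - logits[i]
    return total / len(current)


# Toy embeddings: pairs that line up give a lower loss than shuffled pairs.
current = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = temporal_alignment_loss(current, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = temporal_alignment_loss(current, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the aligned pairing yields the smaller loss, which is the behavior a temporal alignment objective is meant to reward.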

Vivek Myers, Bill Chunyuan Zheng, Anca Dragan, Kuan Fang, Sergey Levine • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Offline Reinforcement Learning | D4RL Franka Kitchen | Mixed Success Rate: 85 | 43 |
| Robotic Manipulation | D4RL Kitchen-Partial | Normalized Score: 100 | 23 |
| Goal-conditioned Reinforcement Learning | antmaze stitch medium | Success Rate: 54 | 23 |
| Goal-conditioned Reinforcement Learning | antmaze stitch large | Success Rate: 17 | 23 |
| Goal-conditioned Reinforcement Learning | manipulation scene-play | Success Rate: 16 | 14 |
| Goal-conditioned Reinforcement Learning | humanoidmaze stitch medium | Success Rate: 45 | 14 |
| Goal-conditioned Reinforcement Learning | humanoidmaze stitch large | Success Rate: 5 | 14 |
| Goal-conditioned Reinforcement Learning | antsoccer stitch arena | Success Rate: 14 | 14 |
| Robotic Manipulation | D4RL Kitchen-Mixed | -- | 14 |
| Goal-conditioned Reinforcement Learning | manipulation-cube-single-play (test) | Success Rate: 0.4 | 11 |
Showing 10 of 25 rows
