
Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning

About

While goal-conditioned behavior cloning (GCBC) methods can perform well on in-distribution training tasks, they do not necessarily generalize zero-shot to tasks that require conditioning on novel state-goal pairs, i.e. combinatorial generalization. In part, this limitation can be attributed to a lack of temporal consistency in the state representation learned by BC; if temporally correlated states are properly encoded to similar latent representations, then the out-of-distribution gap for novel state-goal pairs would be reduced. We formalize this notion by demonstrating how encouraging long-range temporal consistency via successor representations (SR) can facilitate generalization. We then propose a simple yet effective representation learning objective, $\text{BYOL-}\gamma$ for GCBC, which theoretically approximates the successor representation in the finite MDP case through self-predictive representations, and achieves competitive empirical performance across a suite of challenging tasks requiring combinatorial generalization.
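To make the temporal-consistency argument concrete, here is a minimal sketch of a tabular successor representation. It is not the paper's $\text{BYOL-}\gamma$ objective. It assumes a fixed policy that induces a transition matrix `P`, and it uses one common convention, $M = \sum_{t \ge 0} \gamma^t P^t$. The 5-state random-walk chain, the function names, and the iteration count are all hypothetical choices made for illustration.

```python
import math

def successor_representation(P, gamma, iters=500):
    """Iterate M <- I + gamma * P @ M; the fixed point is sum_t gamma^t P^t."""
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iters):
        M = [[identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
    return M

def cosine(a, b):
    """Cosine similarity between two SR rows (state representations)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical chain: random walk on a line of 5 states with reflecting ends.
n = 5
P = [[0.0] * n for _ in range(n)]
for s in range(n):
    P[s][max(s - 1, 0)] += 0.5
    P[s][min(s + 1, n - 1)] += 0.5

M = successor_representation(P, gamma=0.9)

sim_near = cosine(M[0], M[1])  # adjacent states on the line
sim_far = cosine(M[0], M[4])   # opposite ends of the line
print(sim_near, sim_far)
```

In this toy chain the adjacent pair has the higher cosine similarity. That is the property the abstract appeals to: states that are temporally close map to nearby representations.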

Daniel Lawson, Adriana Hugessen, Charlotte Cloutier, Glen Berseth, Khimya Khetarpal • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Offline Reinforcement Learning | D4RL Franka Kitchen | Mixed Success Rate: 69 | 43 |
| Goal-conditioned Reinforcement Learning | antmaze stitch medium | Success Rate: 68 | 23 |
| Goal-conditioned Reinforcement Learning | antmaze stitch large | Success Rate: 26 | 23 |
| Robotic Manipulation | D4RL Kitchen-Partial | Normalized Score: 75 | 23 |
| Goal-conditioned Reinforcement Learning | antsoccer stitch arena | Success Rate: 25 | 14 |
| Goal-conditioned Reinforcement Learning | manipulation scene-play | Success Rate: 17 | 14 |
| Goal-conditioned Reinforcement Learning | humanoidmaze stitch medium | Success Rate: 51 | 14 |
| Goal-conditioned Reinforcement Learning | humanoidmaze stitch large | Success Rate: 13 | 14 |
| Robotic Manipulation | D4RL Kitchen-Mixed | -- | 14 |
| Goal-conditioned Reinforcement Learning | manipulation-cube-single-play (test) | Success Rate: 0.51 | 11 |
Showing 10 of 39 rows
