
History-Aware Visuomotor Policy Learning via Point Tracking

About

Many manipulation tasks require memory beyond the current observation, yet most visuomotor policies rely on the Markov assumption and thus struggle with repeated states or long-horizon dependencies. Existing methods attempt to extend observation horizons but remain insufficient for diverse memory requirements. To this end, we propose an object-centric history representation based on point tracking, which abstracts past observations into a compact and structured form that retains only essential task-relevant information. Tracked points are encoded and aggregated at the object level, yielding a compact history representation that can be seamlessly integrated into various visuomotor policies. Our design provides full history-awareness with high computational efficiency, leading to improved overall task performance and decision accuracy. Through extensive evaluations on diverse manipulation tasks, we show that our method addresses multiple facets of memory requirements, including task stage identification, spatial memorization, and action counting, as well as longer-term demands such as continuous and pre-loaded memory, and that it consistently outperforms both Markovian baselines and prior history-based approaches. Project website: http://tonyfang.net/history

Jingjing Chen, Hongjie Fang, Chenxi Wang, Shiquan Wang, Cewu Lu • 2025

Related benchmarks

Task                               Dataset      Metric             Result  Rank
Add-Salt manipulation              Add-Salt     Success Rate (SR)  85      12
One-Move manipulation              One-Move     Success Rate (SR)  95      12
Swap-Easy manipulation             Swap-Easy    Success Rate (SR)  90      12
Three-Scoop manipulation           Three-Scoop  Success Rate (SR)  95      12
Swap-Hard manipulation             Swap-Hard    Success Rate (SR)  80      9
Guess task with pre-loaded memory  Guess Easy   Success Rate (SR)  95      2
Guess task with pre-loaded memory  Guess Hard   Success Rate (SR)  85      2
