
From Path Signatures to Sequential Modeling: Incremental Signature Contributions for Offline RL

About

Path signatures embed trajectories into the tensor algebra and constitute a universal, non-parametric representation of paths. In their standard form, however, they collapse temporal structure into a single global object, which limits their suitability for decision-making problems that require step-wise reactivity. We propose the Incremental Signature Contribution (ISC) method, which decomposes truncated path signatures into a temporally ordered sequence of elements of the tensor algebra, each corresponding to the incremental contribution induced by the latest path increment. This decomposition preserves the algebraic structure and expressivity of signatures while making their internal temporal evolution explicit, so that signature-based representations can be processed by sequential models. In contrast to full signatures, ISC is inherently sensitive to instantaneous trajectory updates, which is critical for control tasks whose dynamics are sensitive and demand stability. Building on this representation, we introduce the ISC-Transformer (ISCT), an offline reinforcement learning model that integrates ISC into a standard Transformer architecture without further architectural modification. We evaluate ISCT on HalfCheetah, Walker2d, Hopper, and Maze2d, including settings with delayed rewards and downgraded datasets. The results demonstrate that the ISC method provides a theoretically grounded and practically effective alternative approach to path processing for temporally sensitive control tasks.
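To make the idea of per-step contributions concrete, here is a minimal sketch for a piecewise-linear path truncated at depth 2. It assumes one plausible reading of the abstract: the contribution at step t is the difference between the truncated signature up to step t and the one up to step t-1, computed with Chen's identity. The function names `incremental_contributions` and `signature_from_contributions` are hypothetical, and the paper's actual ISC definition may differ in detail.

```python
import numpy as np

def incremental_contributions(path):
    """Per-step depth-2 signature contributions of a piecewise-linear path.

    path: array of shape (T+1, d). Returns a list of T pairs (c1, c2), where
    c1 has shape (d,) and c2 has shape (d, d).
    Assumption: contribution_t = Sig(path[:t+1]) - Sig(path[:t]).
    """
    path = np.asarray(path, dtype=float)
    level1 = np.zeros(path.shape[1])  # running level-1 signature (total displacement)
    contributions = []
    for delta in np.diff(path, axis=0):
        c1 = delta
        # Chen's identity at level 2 for appending one linear segment:
        # S2_new = S2_old + S1_old (x) delta + 0.5 * delta (x) delta
        c2 = np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        contributions.append((c1, c2))
        level1 = level1 + delta
    return contributions

def signature_from_contributions(contributions):
    """Summing the contributions recovers the full truncated signature."""
    s1 = sum(c1 for c1, _ in contributions)
    s2 = sum(c2 for _, c2 in contributions)
    return s1, s2

# An L-shaped path in the plane: right one unit, then up one unit.
steps = incremental_contributions([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
s1, s2 = signature_from_contributions(steps)
levy_area = 0.5 * (s2[0, 1] - s2[1, 0])
```

Under this reading, the sequence `steps` is what a sequential model would consume token by token, while the sum over steps still equals the usual global signature.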

Ziyi Zhao, Qingchuan Li, Yuxuan Xu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| hopper locomotion | D4RL hopper medium-replay | Normalized Score | 70.5 | 56 |
| walker2d locomotion | D4RL walker2d medium-replay | Normalized Score | 68.9 | 53 |
| Locomotion | D4RL walker2d-medium-expert | Normalized Score | 109.1 | 47 |
| Locomotion | D4RL Walker2d medium | Normalized Score | 79.8 | 44 |
| Locomotion | D4RL Halfcheetah medium | Normalized Score | 42.9 | 44 |
| Locomotion | D4RL halfcheetah-medium-expert | Normalized Score | 91.4 | 37 |
| Locomotion | D4RL HalfCheetah Medium-Replay | Normalized Score | 0.41 | 33 |
| Locomotion | D4RL hopper-medium-expert | Normalized Score (100k Steps) | 109.8 | 18 |
| Locomotion | D4RL Hopper medium | Normalized Score | 58.8 | 14 |
| Navigation | D4RL Maze2d-medium | Normalized Return | 86.8 | 9 |
Showing 10 of 14 rows
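
For readers unfamiliar with the metric, D4RL normalized scores are conventionally reported on a scale where a random policy scores 0 and an expert policy scores 100. The sketch below shows that rescaling. The function name and the reference returns in the example are hypothetical placeholders, not values from D4RL or from this paper.

```python
def normalized_score(raw_return, random_return, expert_return):
    """Rescale a raw episode return so that random = 0 and expert = 100."""
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)

# Hypothetical reference returns, chosen only to illustrate the arithmetic.
example = normalized_score(1500.0, random_return=0.0, expert_return=3000.0)
```

A score above 100, as in some medium-expert rows, means the evaluated policy's return exceeded the expert reference return.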
