SHARP: Sleep-based Hierarchical Accelerated Replay for Long Range Non-Stationary Temporal Pattern Recognition

About

Learning long-range non-stationary temporal patterns remains a core challenge for modern sequence models, particularly in strict streaming settings. In these settings, data arrive sequentially and must be processed in a single pass without revisiting past observations. Standard architectures, including recurrent neural networks and transformers, are limited in long-range credit assignment by either the truncated backpropagation-through-time horizon or the explicit input window length. To address these limitations, we propose SHARP (Sleep-based Hierarchical Accelerated Replay), a framework that decomposes temporal learning into two complementary components: a memory module that accumulates a structured history of past inputs, and a pattern-recognition module that operates over this memory. This separation enables resource- and compute-efficient adaptation to non-stationary dynamics by eliminating the need to backpropagate through many time steps for long-range credit assignment. Inspired by the accelerated replay observed in rodents during slow-wave sleep, SHARP incorporates offline (sleep) phases in which temporally structured memory traces are replayed in an accelerated form and integrated into higher-level memory representations, improving long-range context retention. Through controlled simulations and ablation studies, we characterize the key properties of the proposed framework. On benchmark datasets such as text8 and PG-19, we demonstrate that SHARP improves over recurrent baselines by retaining next-token predictive performance on previously seen data while continuing to learn from the current stream and generalizing to future unseen data. These gains are enabled by its hierarchical structure, which yields an exponentially increasing effective temporal context with only linear-time computational cost.

Jayanta Dey, Shikhar Srivastava, Itamar Lerner, Christopher Kanan, Dhireesha Kudithipudi • 2026
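
The abstract's claim that a hierarchy gives exponentially growing context at linear cost can be illustrated with a toy sketch. This is not the authors' implementation: the class name `HierarchicalMemory`, the `fanout` parameter, and the use of mean pooling as the "replay and integrate" step are all assumptions made for illustration only. Each level holds a few slots; during a sleep step, a full level is folded into a single summary at the next level up.

```python
from typing import List


class HierarchicalMemory:
    """Toy multi-level memory (hypothetical; not the SHARP implementation)."""

    def __init__(self, levels: int, fanout: int) -> None:
        self.fanout = fanout
        self.levels: List[List[List[float]]] = [[] for _ in range(levels)]
        self.merge_ops = 0  # counts consolidation steps, to check linear cost

    def write(self, x: List[float]) -> None:
        """Wake phase: append one observation to the lowest level."""
        self.levels[0].append(list(x))

    def needs_sleep(self) -> bool:
        return len(self.levels[0]) >= self.fanout

    def sleep(self) -> None:
        """Offline phase: fold each full level into one slot of the next level.

        Mean pooling stands in for replay-based integration (an assumption).
        """
        for i in range(len(self.levels) - 1):
            if len(self.levels[i]) < self.fanout:
                break
            chunk = self.levels[i][: self.fanout]
            self.levels[i] = self.levels[i][self.fanout:]
            summary = [sum(col) / len(chunk) for col in zip(*chunk)]
            self.levels[i + 1].append(summary)
            self.merge_ops += 1
        # Keep the top level bounded by forgetting its oldest summaries.
        top = self.levels[-1]
        if len(top) > self.fanout:
            del top[: len(top) - self.fanout]

    def slots(self) -> int:
        """Number of vectors currently stored across all levels."""
        return sum(len(level) for level in self.levels)

    def span(self) -> int:
        """Number of raw inputs summarized by the stored slots."""
        return sum(len(level) * self.fanout ** i
                   for i, level in enumerate(self.levels))


def run_stream(n_steps: int, levels: int, fanout: int) -> HierarchicalMemory:
    """Feed a scalar stream 0, 1, 2, ... and sleep whenever level 0 fills."""
    memory = HierarchicalMemory(levels, fanout)
    for t in range(n_steps):
        memory.write([float(t)])
        if memory.needs_sleep():
            memory.sleep()
    return memory


memory = run_stream(n_steps=64, levels=3, fanout=4)
print(memory.slots(), memory.span(), memory.merge_ops)
```

In this toy setup the span covered by a fixed number of slots grows as `fanout ** levels`, while the total number of consolidation steps stays proportional to the stream length, since each level is folded `fanout` times less often than the one below it.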

Related benchmarks

Task	Dataset	Result
Language Modeling	PG-19	--	244
Character-level Language Modeling	text8 (held-out 1M tokens)	BPC 2.3	14
Character-level Language Modeling	text8 (most recent 1M tokens)	BPC 2.23	7
Character-level Language Modeling	text8 100M regime (Forward split)	Forward BPC 2.36	7
Character-level Language Modeling	text8 100M regime (Current stream split)	Current BPC 2.32	7
Character-level Language Modeling	text8 100M regime Backward stream	Backward BPC 2.33	7
Language Modeling	PG-19 subword-level	Forward BPT 4.26	6
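
For readers unfamiliar with the metrics in the table, the sketch below shows how a bits-per-symbol score is typically computed, assuming BPC and BPT denote bits per character and bits per token respectively (the page does not expand the abbreviations). The function name `bits_per_symbol` is hypothetical. The score is the average negative base-2 log probability a model assigns to each true symbol, so lower is better.

```python
import math
from typing import Sequence


def bits_per_symbol(true_symbol_probs: Sequence[float]) -> float:
    """Average of -log2(p) over the probabilities given to the true symbols."""
    if not true_symbol_probs:
        raise ValueError("need at least one probability")
    total_bits = -sum(math.log2(p) for p in true_symbol_probs)
    return total_bits / len(true_symbol_probs)


# A model that guesses uniformly over text8's 27-symbol alphabet
# (a-z plus space) scores log2(27) bits per character.
uniform_bpc = bits_per_symbol([1 / 27] * 100)
print(round(uniform_bpc, 3))
```

As a reference point, uniform guessing over 27 characters costs about 4.75 bits per character, so any useful character-level model on text8 should score well below that.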
