# PRISM: Parallel Residual Iterative Sequence Model
## About
Generative sequence modeling faces a fundamental tension between the expressivity of Transformers and the efficiency of linear sequence models. Existing efficient architectures are theoretically bounded by shallow, single-step linear updates, while powerful iterative methods such as Test-Time Training (TTT) break hardware parallelism due to state-dependent gradients. We propose PRISM (Parallel Residual Iterative Sequence Model) to resolve this tension. PRISM introduces a solver-inspired inductive bias that captures key structural properties of multi-step refinement in a parallelizable form. We employ a Write-Forget Decoupling strategy that isolates non-linearity within the injection operator. To bypass the serial dependency of explicit solvers, PRISM uses a two-stage proxy architecture: a short convolution anchors the initial residual using local history energy, while a learned predictor estimates the refinement updates directly from the input. This design distills the structural patterns of iterative correction into a parallelizable feedforward operator. Theoretically, we prove that this formulation achieves Rank-$L$ accumulation, structurally expanding the update manifold beyond the single-step Rank-$1$ bottleneck. Empirically, PRISM matches the performance of explicit optimization methods while delivering 174x higher throughput.
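The two-stage proxy and the Rank-$L$ claim can be illustrated with a minimal NumPy sketch. All shapes, weight parameterizations, and names below (`conv_w`, `W_v`, `W_k`) are illustrative assumptions, not the paper's exact operators: a short causal convolution anchors an initial residual, a learned predictor emits $L$ refinement writes per step, and accumulating those writes yields a per-step state update of rank up to $L$, computed for all time steps in parallel.

```python
import numpy as np

# Hypothetical sizes: sequence length, model dim, refinement steps, conv width.
rng = np.random.default_rng(0)
T, d, L, k = 16, 8, 4, 3

X = rng.standard_normal((T, d))

# Stage 1: a short causal convolution anchors the initial residual
# from local history (depthwise weights here are illustrative).
conv_w = rng.standard_normal((k, d)) / np.sqrt(k)
X_pad = np.vstack([np.zeros((k - 1, d)), X])
r0 = np.stack([(X_pad[t:t + k] * conv_w).sum(axis=0) for t in range(T)])

# Stage 2: a learned predictor estimates L refinement updates per step
# directly from the input, so no serial solver loop is required.
W_v = rng.standard_normal((L, d, d)) / np.sqrt(d)
W_k = rng.standard_normal((L, d, d)) / np.sqrt(d)
V = np.einsum('lde,te->tld', W_v, X)   # L write vectors per step
K = np.einsum('lde,te->tld', W_k, X)   # L key vectors per step

# Accumulating L rank-1 outer products gives a Rank-L state update per
# step, versus the Rank-1 update of a single-step linear model; every
# step is computed in parallel (no dependence on previous states here).
S = np.einsum('tld,tle->tde', V, K)    # (T, d, d)

# Apply the state to the anchored residual to get the refined output.
y = np.einsum('tde,te->td', S, r0)

print(np.linalg.matrix_rank(S[0]))     # rank up to L, not 1
```

With generic random weights the per-step update matrix attains the full rank $L$, whereas collapsing the $L$ writes into a single write vector would pin it at rank 1.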
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sequential Recommendation | Amazon Books (test) | HR@200 | 0.1258 | 20 |
| Recommendation | Amazon Movies | Hit@500 | 0.2407 | 12 |
| Sequential Recommendation | Amazon Movies (test) | Hit@200 | 14.11 | 12 |
| Recommendation | Amazon Books | Hit Rate@500 | 23.83 | 12 |
| Recommendation | Yelp | Hit Rate@500 | 0.3204 | 12 |
| Recommendation | Amazon Electronics | Hit Rate@500 | 26.13 | 12 |