# PRISM: Parallel Residual Iterative Sequence Model
## About
Generative sequence modeling faces a fundamental tension between the expressivity of Transformers and the efficiency of linear sequence models. Existing efficient architectures are theoretically bounded by shallow, single-step linear updates, while powerful iterative methods such as Test-Time Training (TTT) break hardware parallelism due to state-dependent gradients. We propose PRISM (Parallel Residual Iterative Sequence Model) to resolve this tension. PRISM introduces a solver-inspired inductive bias that captures key structural properties of multi-step refinement in a parallelizable form. We employ a Write-Forget Decoupling strategy that isolates non-linearity within the injection operator. To bypass the serial dependency of explicit solvers, PRISM uses a two-stage proxy architecture: a short convolution anchors the initial residual using local history energy, while a learned predictor estimates the refinement updates directly from the input. This design distills the structural patterns of iterative correction into a parallelizable feedforward operator. Theoretically, we prove that this formulation achieves Rank-$L$ accumulation, structurally expanding the update manifold beyond the single-step Rank-$1$ bottleneck. Empirically, PRISM matches the performance of explicit optimization methods while delivering 174x higher throughput.
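The two-stage proxy and the Rank-$L$ claim can be illustrated with a minimal NumPy sketch. All shapes, weight parameterizations, and names below (`conv_w`, `W_v`, `W_k`) are illustrative assumptions, not the paper's exact operators: a short causal convolution anchors an initial residual, a learned predictor emits $L$ refinement writes per step, and accumulating those writes yields a per-step state update of rank up to $L$, computed for all time steps in parallel.

```python
import numpy as np

# Hypothetical sizes: sequence length, model dim, refinement steps, conv width.
rng = np.random.default_rng(0)
T, d, L, k = 16, 8, 4, 3

X = rng.standard_normal((T, d))

# Stage 1: a short causal convolution anchors the initial residual
# from local history (depthwise weights here are illustrative).
conv_w = rng.standard_normal((k, d)) / np.sqrt(k)
X_pad = np.vstack([np.zeros((k - 1, d)), X])
r0 = np.stack([(X_pad[t:t + k] * conv_w).sum(axis=0) for t in range(T)])

# Stage 2: a learned predictor estimates L refinement updates per step
# directly from the input, so no serial solver loop is required.
W_v = rng.standard_normal((L, d, d)) / np.sqrt(d)
W_k = rng.standard_normal((L, d, d)) / np.sqrt(d)
V = np.einsum('lde,te->tld', W_v, X)   # L write vectors per step
K = np.einsum('lde,te->tld', W_k, X)   # L key vectors per step

# Accumulating L rank-1 outer products gives a Rank-L state update per
# step, versus the Rank-1 update of a single-step linear model; every
# step is computed in parallel (no dependence on previous states here).
S = np.einsum('tld,tle->tde', V, K)    # (T, d, d)

# Apply the state to the anchored residual to get the refined output.
y = np.einsum('tde,te->td', S, r0)

print(np.linalg.matrix_rank(S[0]))     # rank up to L, not 1
```

With generic random weights the per-step update matrix attains the full rank $L$, whereas collapsing the $L$ writes into a single write vector would pin it at rank 1.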
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Sequential Recommendation | Amazon Books (test) | HR@200 | 0.1258 | 20 |
| Recommendation | Amazon Movies | Hit@500 | 0.2407 | 12 |
| Sequential Recommendation | Amazon Movies (test) | Hit@200 | 14.11 | 12 |
| Recommendation | Amazon Books | Hit Rate@500 | 23.83 | 12 |
| Recommendation | Yelp | Hit Rate@500 | 0.3204 | 12 |
| Recommendation | Amazon Electronics | Hit Rate@500 | 26.13 | 12 |