Panda: A pretrained forecast model for chaotic dynamics
About
Chaotic systems are intrinsically sensitive to small errors, challenging efforts to construct predictive data-driven models of real-world dynamical systems such as fluid flows or neuronal activity. Prior efforts comprise either specialized models trained on individual time series, or foundation models trained on vast time series databases with little underlying dynamical structure. Motivated by dynamical systems theory, we present Panda, Patched Attention for Nonlinear DynAmics. We train Panda on a novel synthetic, extensible dataset of $2 \times 10^4$ chaotic dynamical systems that we discover using an evolutionary algorithm. Trained purely on simulated data, Panda exhibits emergent properties: zero-shot forecasting of unseen chaotic systems preserving both short-term accuracy and distributional measures, nonlinear resonance patterns in attention heads, and effective prediction of real-world experimental time series. Despite having been trained only on low-dimensional ordinary differential equations, Panda spontaneously develops the ability to predict partial differential equations without retraining. We also demonstrate a neural scaling law for differential equations, underscoring the potential of pre-trained models for probing abstract mathematical domains like nonlinear dynamics.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Long-term forecasting | Systems long-term trajectories of length 4096 (test) | Avg Spectral Hellinger Distance ($H^2$) | 0.25 | 25 |
| Distributional Prediction | systems (test) | KL Divergence | 2.82 | 25 |
| Time Series Forecasting | systems (test) | Per-system Difference in KL Divergence | 0.14 | 20 |
| Time-series Distributional Prediction | systems (test) | Spectral Hellinger Distance (Per-System) | 0.04 | 20 |
| Long-horizon forecasting | Held-out systems | MAE | 0.35 | 18 |
| Time Series Forecasting | Held-out systems (test) | sMAPE (Median) | 27.6 | 18 |
| Time Series Forecasting | Multivariate Time Series L=128 | -- | -- | 7 |
| Time Series Forecasting | Multivariate Time Series L=512 | -- | -- | 7 |
| Long-term forecasting | Physical Systems L_pred=512 (test) | KL Divergence | 3.93 | 5 |
| Long-term forecasting | Physical Systems L_pred=1024 (test) | KL Divergence ($D_{KL}$) | 4.72 | 5 |
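
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how two of them are commonly computed. It is not taken from the Panda codebase. The function names `smape` and `spectral_hellinger_sq`, the 0 to 200 scaling of sMAPE, and the use of a mean-removed FFT power spectrum normalized to sum to one are all assumptions chosen for illustration; the paper's exact definitions may differ.

```python
import numpy as np


def smape(y_true, y_pred, eps=1e-12):
    """Symmetric mean absolute percentage error on a 0-200 scale (assumed convention)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.abs(y_true) + np.abs(y_pred)
    # eps guards against division by zero when both values are exactly zero
    return 200.0 * np.mean(np.abs(y_true - y_pred) / np.maximum(denom, eps))


def spectral_hellinger_sq(x, y):
    """Squared Hellinger distance H^2 between normalized power spectra of two 1-D series.

    Assumes both series have the same length and are not constant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Power spectra of the mean-removed signals
    px = np.abs(np.fft.rfft(x - x.mean())) ** 2
    py = np.abs(np.fft.rfft(y - y.mean())) ** 2
    # Normalize so each spectrum behaves like a discrete probability distribution
    px = px / px.sum()
    py = py / py.sum()
    # H^2 = 1 - sum_k sqrt(p_k * q_k), which lies in [0, 1]
    return 1.0 - np.sum(np.sqrt(px * py))


if __name__ == "__main__":
    t = np.arange(256) / 256.0
    slow = np.sin(2 * np.pi * 5 * t)
    fast = np.sin(2 * np.pi * 20 * t)
    print("sMAPE:", smape(slow + 2.0, fast + 2.0))
    print("H^2 (same signal):", spectral_hellinger_sq(slow, slow))
    print("H^2 (different frequencies):", spectral_hellinger_sq(slow, fast))
```

Under these assumptions, a forecast whose power spectrum matches the ground truth gives $H^2$ near 0 even if the pointwise trajectories have diverged, which is why spectral measures are reported alongside pointwise errors such as MAE and sMAPE for chaotic systems.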