
(Sparse) Attention to the Details: Preserving Spectral Fidelity in ML-based Weather Forecasting Models

About

We introduce Mosaic, a probabilistic weather forecasting model that addresses three failure modes of spectral degradation in ML-based weather prediction: spectral damping (statistical), high-frequency aliasing (architectural), and residual high-frequency leakage (parametric). Mosaic generates ensemble members through learned functional perturbations and operates on native-resolution grids via mesh-aligned block-sparse attention, a hardware-aligned mechanism that captures long-range dependencies at linear cost by sharing keys and values across spatially adjacent queries. At 1.5° resolution with 214M parameters, Mosaic matches or outperforms models trained on 6× finer resolution on key variables and achieves state-of-the-art results among 1.5° models, producing well-calibrated ensembles whose individual members exhibit near-perfect spectral alignment across all resolved frequencies. A 24-member, 10-day forecast takes under 12s on a single H100 GPU. Code is available at https://github.com/maxxxzdn/mosaic.
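
As a rough illustration of what "spectral damping" means in practice, the sketch below compares the zonal (along-longitude) power spectrum of a smoothed field against a reference field. It is not taken from the Mosaic code base or the paper's evaluation: the function names, the 121 × 240 grid shape assumed for 1.5° data, the random reference field, and the moving-average "forecast" are all hypothetical stand-ins, and a plain FFT along longitude is a simplification of a full spherical-harmonic analysis.

```python
import numpy as np

def zonal_power_spectrum(field):
    """Mean power per zonal wavenumber for a (lat, lon) field.

    Simplified diagnostic: a 1D real FFT along longitude, averaged over
    latitude rows (no area weighting, no spherical harmonics).
    """
    coeffs = np.fft.rfft(field, axis=-1) / field.shape[-1]
    return (np.abs(coeffs) ** 2).mean(axis=0)

def spectral_ratio(forecast, reference, eps=1e-12):
    """Forecast power divided by reference power at each wavenumber.

    Values near 1 indicate matching spectra; values below 1 indicate
    that the forecast has lost energy at that wavenumber.
    """
    return zonal_power_spectrum(forecast) / (zonal_power_spectrum(reference) + eps)

# Assumed grid shape for 1.5 degree data: 121 latitudes x 240 longitudes.
rng = np.random.default_rng(0)
reference = rng.standard_normal((121, 240))

# Toy stand-in for a blurry forecast: periodic 3-point moving average
# along longitude, which attenuates small-scale structure.
forecast = (
    np.roll(reference, 1, axis=-1) + reference + np.roll(reference, -1, axis=-1)
) / 3.0

ratio = spectral_ratio(forecast, reference)
print(ratio[0], ratio[-1])  # largest scale vs. smallest resolved scale
```

In this toy setup the ratio stays at 1 for the zonal mean and drops well below 1 at the smallest resolved scales, which is the signature the abstract calls spectral damping. A member with "near-perfect spectral alignment" would keep this ratio close to 1 across all wavenumbers.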

Maksim Zhdanov, Ana Lucic, Max Welling, Jan-Willem van de Meent • 2026

Related benchmarks

Task                                         Dataset                Metric         Result  Rank
Global Weather Forecasting (240h lead-time)  ERA5 2020 (test)       Z500           36.78   16
Global Weather Forecasting                   WeatherBench 2         Time per Step  0.048   8
Weather forecasting                          ERA5 1.5° (2020 test)  Z500 RMSE      624.1   7
Weather forecasting                          ERA5 2020 (test)       RMSE T2M       2.053   4
