MSPT: Efficient Large-Scale Physical Modeling via Parallelized Multi-Scale Attention
About
A key scalability challenge in neural solvers for industrial-scale physics simulations is efficiently capturing both fine-grained local interactions and long-range global dependencies across millions of spatial elements. We introduce the Multi-Scale Patch Transformer (MSPT), an architecture that combines local point attention within patches with global attention to coarse patch-level representations. To partition the input domain into spatially-coherent patches, we employ ball trees, which handle irregular geometries efficiently. This dual-scale design enables MSPT to scale to millions of points on a single GPU. We validate our method on standard PDE benchmarks (elasticity, plasticity, fluid dynamics, porous flow) and large-scale aerodynamic datasets (ShapeNet-Car, Ahmed-ML), achieving state-of-the-art accuracy with substantially lower memory footprint and computational cost.
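The paragraph above describes two mechanisms: partitioning an irregular point cloud into spatially coherent patches via a ball tree, then mixing local attention within each patch with global attention to coarse patch-level summaries. The sketch below illustrates that dual-scale idea in a few dozen lines; it is not the authors' implementation, and the simplified recursive split, helper names, and NumPy-based attention are all assumptions for illustration.

```python
import numpy as np

def ball_tree_patches(points, leaf_size):
    """Recursively split a point cloud into spatially coherent patches.

    Simplified ball-tree-style construction (an assumption, not MSPT's exact
    procedure): each node splits its points by their projection onto the
    direction from the centroid to the farthest point, so leaves group
    nearby points even on irregular geometries. Returns index arrays.
    """
    patches = []

    def split(ids):
        if len(ids) <= leaf_size:
            patches.append(ids)
            return
        pts = points[ids]
        centroid = pts.mean(axis=0)
        # Direction of largest spread: centroid -> farthest point.
        far = pts[np.argmax(np.linalg.norm(pts - centroid, axis=1))]
        proj = (pts - centroid) @ (far - centroid)
        order = np.argsort(proj)
        half = len(ids) // 2
        split(ids[order[:half]])
        split(ids[order[half:]])

    split(np.arange(len(points)))
    return patches

def _softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dual_scale_attention(feats, patches):
    """Toy single-head dual-scale attention over (N, d) features.

    Each point attends (1) locally to points in its own patch and
    (2) globally to one coarse summary token per patch (here a mean
    pool, an illustrative choice). Cost is O(N * patch_size + N * P)
    instead of O(N^2) for full attention.
    """
    d = feats.shape[1]
    patch_means = np.stack([feats[p].mean(axis=0) for p in patches])
    out = np.empty_like(feats)
    for p in patches:
        q = feats[p]
        local = _softmax(q @ feats[p].T / np.sqrt(d)) @ feats[p]
        global_ = _softmax(q @ patch_means.T / np.sqrt(d)) @ patch_means
        out[p] = local + global_
    return out
```

Because each point only ever attends to its own patch plus one token per patch, memory grows with patch size and patch count rather than quadratically in the number of points, which is what lets this style of model reach millions of points on a single GPU.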
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Operator learning | Airfoil Structured Mesh (test) | Relative L2 Error | 0.0051 | 15 |
| Operator learning | Pipe Structured Mesh (test) | Relative L2 Error | 0.0031 | 15 |
| Operator learning | Navier-Stokes Regular Grid (test) | Relative L2 Error | 0.0632 | 15 |
| CFD field reconstruction | ShapeNet Car (test) | Volume Error | 1.89 | 15 |
| Operator learning | Plasticity Structured Mesh (test) | Relative L2 Error | 0.001 | 15 |
| Operator learning | Darcy Regular Grid (test) | Relative L2 Error | 0.0063 | 15 |
| Operator learning | Elasticity Point Cloud (test) | Relative L2 Error | 0.0048 | 13 |
| CFD field reconstruction | AhmedML (test) | Volume Metric | 2.04 | 11 |