AOT-POT: Adaptive Operator Transformation for Large-Scale PDE Pre-training

About

Pre-training neural operators on diverse partial differential equation (PDE) datasets has emerged as a promising direction for building general-purpose surrogate models in scientific machine learning. However, the inherent complexity and structural diversity of PDE solution operators make multi-PDE pre-training fundamentally challenging. Existing methods mainly address this by increasing model capacity while leaving the target solution operators unchanged. Inspired by classical numerical analysis, we instead propose to transform complex and diverse solution operators into simpler, better-aligned forms that are easier to model jointly. Since the optimal transformation varies across PDE types, it must be adaptive and input-dependent, allowing a single neural operator to approximate an entire family of operators. We instantiate this idea as AOT-POT (adaptive operator-transformation for pre-training operator transformer), which expands hidden representations into multiple parallel streams, adaptively aggregates and redistributes them before and after each sub-layer, and mixes streams through Sinkhorn-projected doubly stochastic matrices for stable training. Together, these mechanisms reshape diverse solution operators into a unified form that a single architecture can model effectively. Empirically, AOT-POT achieves state-of-the-art performance on 12 PDE benchmarks with only 3% additional parameters, reducing relative L2 error by up to 77.6% (40.9% on average). Fine-tuning AOT-POT further reduces L2 error by up to 92% on in-domain PDEs and 89% on out-of-domain PDEs (PDE types unseen during pre-training), demonstrating that adaptive operator transformation is an effective and complementary direction for advancing PDE foundation models beyond simply scaling model capacity.
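To make the stream-mixing step more concrete, here is a minimal sketch of Sinkhorn projection onto (approximately) doubly stochastic matrices, followed by mixing parallel streams with the result. The abstract does not specify the exact formulation used in AOT-POT, so the function names `sinkhorn_project` and `mix_streams`, the exponentiation of raw logits, and the fixed iteration count are all illustrative assumptions and not the authors' implementation.

```python
import math


def sinkhorn_project(logits, n_iters=100):
    """Approximately map a square matrix of logits to a doubly stochastic
    matrix by alternating row and column normalization (Sinkhorn iteration).

    Assumption: logits are exponentiated first so every entry is positive.
    """
    n = len(logits)
    # Shift by the global max before exponentiating for numerical stability.
    shift = max(max(row) for row in logits)
    mat = [[math.exp(v - shift) for v in row] for row in logits]
    for _ in range(n_iters):
        # Normalize each row to sum to 1.
        for row in mat:
            s = sum(row)
            for j in range(n):
                row[j] /= s
        # Normalize each column to sum to 1.
        for j in range(n):
            s = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= s
    return mat


def mix_streams(streams, mixing):
    """Mix n parallel streams (each a list of d floats) with an n x n matrix.

    Output stream i is sum_j mixing[i][j] * streams[j].
    """
    n = len(streams)
    d = len(streams[0])
    return [
        [sum(mixing[i][j] * streams[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]
```

Because the rows of a doubly stochastic matrix sum to 1, each mixed stream is a convex combination of the input streams, and because the columns sum to 1, the sum over streams is preserved. Both properties keep activations from growing or shrinking under repeated mixing, which is one plausible reading of the "stable training" claim.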

Qitan Lv, Hong Wang, Zhongkai Hao, Wen Wu, Xuenan Xu, Bowen Zhou, Feng Wu, Chao Zhang • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Operator learning | PDEBench DR | L2RE | 0.0064 | 28 |
| Operator learning | PDEBench SWE | L2RE | 7.20e-4 | 28 |
| Operator learning | FNO-ν 1e-5 | L2RE | 1.45 | 25 |
| Operator learning | FNO-ν 1e-4 | L2RE | 0.0061 | 25 |
| Operator learning | FNO-ν 1e-3 | L2RE | 0.0018 | 25 |
| Operator learning | PDEBench CNS (η=1, ζ=0.1) | L2RE | 0.0043 | 25 |
| Operator learning | PDEBench CNS (η=1, ζ=0.01) | L2RE | 0.491 | 25 |
| Operator learning | PDEBench CNS (η=0.1, ζ=0.1) | L2RE | 0.0079 | 25 |
| Operator learning | PDEBench CNS (η=0.1, ζ=0.01) | L2RE | 0.0032 | 25 |
| Operator learning | PDEArena NS | L2RE | 2.36 | 25 |

L2RE: L2 relative error.
Showing 10 of 37 rows
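
For reference, the L2 relative error (L2RE) reported in the table is conventionally the L2 norm of the prediction error divided by the L2 norm of the ground truth. The sketch below shows that standard definition for a single flattened sample. How the benchmarks average over samples, time steps, or channels is not stated on this page, so that part is left out, and the function name `relative_l2_error` is hypothetical.

```python
import math


def relative_l2_error(pred, target):
    """Return ||pred - target||_2 / ||target||_2 for two equal-length sequences.

    Assumption: inputs are already flattened to 1-D and target is not all zeros.
    """
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))
    den = math.sqrt(sum(t ** 2 for t in target))
    return num / den
```

Under this definition a value of 0 means a perfect prediction, and a value of 1 is what an all-zero prediction would score.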
