Learning to Think in Physics: Breaking Shortcut Learning in Scientific Diffusion via Representation Alignment
About
Physics-informed diffusion models typically enforce PDE constraints only on final outputs, leaving intermediate representations unconstrained and prone to shortcut learning under shifted boundary conditions. We introduce **REPA-P**, a teacher-free, architecture-agnostic framework that aligns intermediate features with physical states using first-principles residuals. REPA-P attaches lightweight $1{\times}1$ projection heads to selected layers, decodes hidden activations into physical quantities, and applies PDE residual losses during training. These heads are discarded at inference, introducing **zero overhead**. Across four PDE tasks, including Darcy flow, topology optimization, electrostatic potential, and turbulent channel flow, REPA-P accelerates convergence by up to $2{\times}$, reduces physics residuals by up to $66.4\%$, and improves out-of-distribution robustness by up to $49.3\%$, with consistent gains on both U-Net and Diffusion Transformer backbones. Ablations show that supervising a small set of intermediate layers captures most benefits and complements output-level physics losses. Code is available at [https://github.com/Hxxxz0/REPA-P](https://github.com/Hxxxz0/REPA-P).
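To make the mechanism concrete, here is a minimal NumPy sketch of the idea described above: a $1{\times}1$ projection head decodes a hidden feature map into a physical field, and a PDE residual on that decoded field serves as an auxiliary training loss. This is not the authors' implementation (see the linked repository for that). The function names, the Darcy form $-\nabla\cdot(a\nabla u)=f$, the finite-difference stencil, and the assumption that hidden features have already been resized to the field resolution are all illustrative choices made here.

```python
import numpy as np

def project_1x1(h, weight, bias):
    """Hypothetical 1x1 projection head: a per-pixel linear map over channels.
    h: (C, H, W) hidden activations; weight: (K, C); bias: (K,)."""
    return np.einsum("kc,chw->khw", weight, h) + bias[:, None, None]

def darcy_residual(u, a, f, dx):
    """Residual of -div(a grad u) = f on interior grid points (assumed PDE form).
    Permeability is averaged onto cell faces; fluxes use central differences."""
    uc, ac = u[1:-1, 1:-1], a[1:-1, 1:-1]
    a_e = 0.5 * (ac + a[1:-1, 2:])
    a_w = 0.5 * (ac + a[1:-1, :-2])
    a_n = 0.5 * (ac + a[2:, 1:-1])
    a_s = 0.5 * (ac + a[:-2, 1:-1])
    div = (a_e * (u[1:-1, 2:] - uc) - a_w * (uc - u[1:-1, :-2])
           + a_n * (u[2:, 1:-1] - uc) - a_s * (uc - u[:-2, 1:-1])) / dx**2
    return -div - f[1:-1, 1:-1]

def alignment_loss(hiddens, heads, a, f, dx):
    """Mean squared PDE residual of the fields decoded from each selected layer.
    Used only during training; the heads are dropped at inference."""
    losses = []
    for h, (weight, bias) in zip(hiddens, heads):
        u = project_1x1(h, weight, bias)[0]  # channel 0 is read as the field u
        losses.append(np.mean(darcy_residual(u, a, f, dx) ** 2))
    return float(np.mean(losses))

# Toy check: u = x^2 + y^2 with a = 1 satisfies -div(a grad u) = -4.
n = 17
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x)
u_true = X**2 + Y**2
a = np.ones((n, n))
f = -4.0 * np.ones((n, n))
hidden = np.stack([u_true, np.sin(7.0 * X)])          # (C=2, H, W) toy features
good_head = (np.array([[1.0, 0.0]]), np.zeros(1))     # decodes channel 0
bad_head = (np.array([[0.0, 1.0]]), np.zeros(1))      # decodes the unrelated channel
loss_good = alignment_loss([hidden], [good_head], a, f, x[1] - x[0])
loss_bad = alignment_loss([hidden], [bad_head], a, f, x[1] - x[0])
```

In this toy setup the head that decodes a physically consistent field yields a near-zero residual, while the head that decodes an unrelated feature is penalized. In training, such a term would be added to the usual diffusion objective with some weighting, and gradients would flow through the head into the backbone's intermediate features.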
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Generation | Darcy Flow | Data Fidelity Deviation | 1.19 | 10 |
| Generation | Charge | Data Metric Value | 0.0081 | 10 |
| Reconstruction | Darcy Flow | PSNR | 38.41 | 10 |
| Reconstruction | Turbulence | PSNR | 39.95 | 10 |
| Topology Optimization | Topology Optimization In-Distribution | CE (%) | 4.17 | 10 |
| Topology Optimization | Topology Optimization Out-of-Distribution | CE (%) | 5.05 | 10 |