SAS-Net: Cross-Domain Image Registration as Inverse Rendering via Structure-Appearance Factorization
About
Cross-domain image registration requires aligning images acquired under heterogeneous imaging physics, where the classical brightness constancy assumption is fundamentally violated. We formulate this problem through an image formation model I = R(s, a) + epsilon, where each observation is generated by a rendering function R acting on domain-invariant scene structure s and domain-specific appearance statistics a. Registration then reduces to an inverse rendering problem: given observations from two domains, recover the shared structure and re-render it under the target appearance to obtain the registered output. We instantiate this framework as SAS-Net (Scene-Appearance Separation Network), where instance normalization implements the structure-appearance decomposition and Adaptive Instance Normalization (AdaIN) realizes the differentiable forward renderer. A scene consistency loss enforces geometric correspondence in the factorized latent space. Experiments on EuroSAT-Reg-256 (satellite remote sensing) and FIRE-Reg-256 (retinal fundus) demonstrate state-of-the-art performance across heterogeneous imaging domains. SAS-Net (3.35M parameters) achieves 89 FPS on an RTX 5090 GPU. Code: https://github.com/D-ST-Sword/SAS-Net.
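The factorization described above can be illustrated with a minimal NumPy sketch. It assumes that "appearance" a is the per-channel mean and standard deviation of a feature map and that "structure" s is the instance-normalized residual; the function names (`separate`, `render`, `adain`) are hypothetical and are not taken from the SAS-Net code base, which presumably applies these operations to learned encoder features rather than raw arrays.

```python
import numpy as np

def separate(feat, eps=1e-5):
    """Instance normalization as a structure/appearance split.

    feat: array of shape (N, C, H, W).
    Returns the normalized structure s and the appearance a = (mean, std),
    where the statistics are computed per sample and per channel.
    """
    mean = feat.mean(axis=(2, 3), keepdims=True)
    std = np.sqrt(feat.var(axis=(2, 3), keepdims=True) + eps)
    structure = (feat - mean) / std
    return structure, (mean, std)

def render(structure, appearance):
    """Forward renderer R(s, a): re-apply appearance statistics to structure."""
    mean, std = appearance
    return structure * std + mean

def adain(source_feat, target_feat, eps=1e-5):
    """AdaIN: keep the structure of source_feat, adopt the appearance of target_feat."""
    s, _ = separate(source_feat, eps)
    _, a_target = separate(target_feat, eps)
    return render(s, a_target)

# Illustrative data only: two random "domains" with different statistics.
rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, scale=1.0, size=(2, 3, 8, 8))
target = rng.normal(loc=5.0, scale=3.0, size=(2, 3, 8, 8))
rerendered = adain(source, target)
```

In this sketch, `render(*separate(x))` reconstructs `x`, and `rerendered` carries the per-channel mean and standard deviation of `target` while preserving the normalized spatial pattern of `source`. The scene consistency loss and any learned encoder or decoder are outside the scope of this illustration.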
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| OR-PAM Registration | OR-PAM-Reg 4K (test) | SSIM | 89.4 | 25 |
| Intra-frame Image Registration | OR-PAM-Reg-Temporal-26K (test) | NCC | 0.994 | 18 |
| Image Registration | OR-PAM | Time (ms) | 11.2 | 11 |
| Image Registration | OR-PAM-Reg-Temporal 26K | TNCC | 0.967 | 9 |
| Temporal consistency evaluation | OR-PAM-Reg-Temporal-26K (test) | TNCC | 0.967 | 9 |