Robust Image Stitching with Optimal Plane
About
We present RopStitch, an unsupervised deep image stitching framework that offers both robustness and naturalness. To make RopStitch robust, we incorporate a universal prior of content perception into the stitching model through a dual-branch architecture, which captures coarse and fine features separately and integrates them to generalize well across diverse unseen real-world scenes. Concretely, the dual-branch model consists of a pretrained branch that captures semantically invariant representations and a learnable branch that extracts fine-grained discriminative features; the two are merged by a controllable factor at the correlation level. In addition, because content alignment and structural preservation often conflict, we propose the concept of virtual optimal planes to relieve this conflict. To this end, we model the problem as estimating homography decomposition coefficients, and design an iterative coefficient predictor and a minimal semantic distortion constraint to identify the optimal plane. This scheme is incorporated into RopStitch by warping both views onto the optimal plane bidirectionally. Extensive experiments across various datasets demonstrate that RopStitch significantly outperforms existing methods, particularly in scene robustness and content naturalness. The code is available at https://github.com/MmelodYy/RopStitch.
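To make "merged by a controllable factor at the correlation level" concrete, the sketch below blends two correlation matrices, one per branch, with a scalar weight. It is an illustration under stated assumptions, not the authors' implementation: the cosine-similarity correlation, the convex blend, and the names `correlation`, `fuse_correlations`, and `alpha` are all hypothetical, and random arrays stand in for real branch features.

```python
import numpy as np

def l2_normalize(feat, eps=1e-8):
    """Scale each feature vector (last axis) to approximately unit length."""
    return feat / (np.linalg.norm(feat, axis=-1, keepdims=True) + eps)

def correlation(feat_a, feat_b):
    """Cosine similarity between every position of two (H, W, C) feature maps.

    Returns a matrix of shape (H*W, H*W).
    """
    a = l2_normalize(feat_a).reshape(-1, feat_a.shape[-1])
    b = l2_normalize(feat_b).reshape(-1, feat_b.shape[-1])
    return a @ b.T

def fuse_correlations(corr_pretrained, corr_learnable, alpha):
    """Convex blend of two correlation matrices.

    alpha is a hypothetical stand-in for the controllable factor:
    1.0 keeps only the pretrained branch, 0.0 only the learnable branch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * corr_pretrained + (1.0 - alpha) * corr_learnable

# Random stand-ins for the features each branch would produce for the
# reference and target views (assumed shapes, chosen only for illustration).
rng = np.random.default_rng(0)
H, W = 4, 4
ref_pre, tgt_pre = rng.normal(size=(H, W, 16)), rng.normal(size=(H, W, 16))
ref_fine, tgt_fine = rng.normal(size=(H, W, 8)), rng.normal(size=(H, W, 8))

fused = fuse_correlations(
    correlation(ref_pre, tgt_pre),
    correlation(ref_fine, tgt_fine),
    alpha=0.5,
)
```

One reason fusing at this level is convenient: the two branches may have different channel counts (16 and 8 here), yet their correlation matrices share the same shape, so they can be combined directly.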
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Stitching | UDIS-D (test) | mPSNR (Easy) | 29.93 | 17 |
| Image Stitching | classical datasets | mPSNR (Easy) | 25.4 | 11 |
| Image Stitching | Classical Datasets Easy | mPSNR | 25.4 | 9 |
| Image Stitching | Classical Datasets Moderate | mPSNR | 19.79 | 9 |
| Image Stitching | Classical Datasets Hard | mPSNR | 15.48 | 9 |
| Image Stitching | Classical Datasets Average | mPSNR | 19.74 | 9 |
| Image Stitching | Cat (test) | Inference Time (s) | 0.0389 | 7 |
| Image Stitching | Reception (test) | Inference Time (s) | 0.0868 | 7 |
| Image Stitching | Construction site (test) | Inference Time (s) | 0.1566 | 7 |
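
The page does not define mPSNR. A common reading in stitching benchmarks, assumed here, is PSNR computed over the overlapping region of the two warped views and then averaged across image pairs. The sketch below follows that assumption; the function names and the 8-bit peak value of 255 are illustrative choices, not taken from the source.

```python
import numpy as np

def masked_psnr(img_a, img_b, mask, max_val=255.0):
    """PSNR in dB between two images, restricted to pixels where mask is True.

    img_a, img_b: arrays of shape (H, W) or (H, W, C); mask: shape (H, W).
    """
    mask = mask.astype(bool)
    if not mask.any():
        raise ValueError("mask selects no pixels")
    diff = img_a[mask].astype(np.float64) - img_b[mask].astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical pixels inside the mask
    return 10.0 * np.log10(max_val ** 2 / mse)

def mean_psnr(pairs):
    """Average masked PSNR over an iterable of (img_a, img_b, mask) triples."""
    return float(np.mean([masked_psnr(a, b, m) for a, b, m in pairs]))

# Toy example: two 2x2 grayscale images differing by one intensity level.
a = np.full((2, 2), 100, dtype=np.uint8)
b = np.full((2, 2), 101, dtype=np.uint8)
overlap = np.ones((2, 2), dtype=bool)
score = mean_psnr([(a, b, overlap)])
```

Casting to `float64` before subtracting avoids unsigned-integer wraparound on 8-bit inputs.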