Memory-Efficient Optical Flow via Radius-Distribution Orthogonal Cost Volume
About
The full 4D cost volume in Recurrent All-Pairs Field Transforms (RAFT) and global matching by Transformer achieve impressive performance for optical flow estimation. However, the memory consumption of both approaches increases quadratically with input resolution, rendering them impractical for high-resolution images. In this paper, we present MeFlow, a novel memory-efficient method for high-resolution optical flow estimation. The key to MeFlow is a recurrent local orthogonal cost volume representation, which dynamically decomposes the 2D search space into two 1D orthogonal spaces, enabling our method to scale effectively to very high-resolution inputs. To preserve essential information in the orthogonal space, we use self-attention to propagate feature information from the 2D space to the orthogonal space. We further propose a radius-distribution multi-scale lookup strategy to model the correspondences of large displacements at negligible cost. We verify the efficiency and effectiveness of our method on the challenging Sintel and KITTI benchmarks and on real-world 4K ($2160\!\times\!3840$) images. Our method achieves competitive performance on both Sintel and KITTI while maintaining the highest memory efficiency on high-resolution inputs.
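To make the memory argument concrete, the following is a rough back-of-the-envelope sketch, not the paper's implementation. It assumes features at 1/8 of the input resolution (as in RAFT), 4-byte float32 entries, and a hypothetical lookup radius of 4; the function names are illustrative only. It compares the size of an all-pairs 4D cost volume with the number of candidates per pixel for a local 2D window versus two 1D orthogonal (horizontal and vertical) windows.

```python
def full_cost_volume_bytes(height, width, stride=8, bytes_per_entry=4):
    """Bytes needed for an all-pairs 4D cost volume.

    Assumption: features live at 1/stride of the input resolution and
    every feature-map pixel is correlated with every other one.
    """
    h, w = height // stride, width // stride
    return (h * w) ** 2 * bytes_per_entry


def local_2d_candidates(radius):
    """Candidates per pixel for a square (2r+1) x (2r+1) search window."""
    return (2 * radius + 1) ** 2


def orthogonal_candidates(radius):
    """Candidates per pixel for two 1D windows (horizontal + vertical)."""
    return 2 * (2 * radius + 1)


if __name__ == "__main__":
    # 4K input from the abstract: 2160 x 3840.
    gb = full_cost_volume_bytes(2160, 3840) / 1e9
    print(f"all-pairs cost volume at 4K: {gb:.1f} GB")
    # Hypothetical radius of 4, chosen only for illustration.
    print("local 2D candidates per pixel:", local_2d_candidates(4))
    print("orthogonal candidates per pixel:", orthogonal_candidates(4))
```

Under these assumptions the all-pairs volume at 4K needs roughly 67 GB, and doubling both image dimensions multiplies it by 16, whereas the per-pixel cost of the two 1D windows grows only linearly with the radius.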
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Optical Flow Estimation | KITTI 2015 (train) | Fl-epe | 5.31 | 431 |
| Optical Flow Estimation | MPI Sintel Final (train) | Endpoint Error (EPE) | 2.75 | 209 |
| Optical Flow Estimation | MPI Sintel Clean (train) | EPE | 1.49 | 202 |
| Optical Flow | KITTI-15 (test) | Fl-all | 4.95 | 85 |
| Optical Flow | Sintel Clean | EPE | 2.054 | 27 |
| Optical Flow | Sintel Final | EPE | 3.09 | 27 |