2K Retrofit: Entropy-Guided Efficient Sparse Refinement for High-Resolution 3D Geometry Prediction
About
High-resolution geometric prediction is essential for robust perception in autonomous driving, robotics, and AR/MR, but current foundation models are fundamentally limited by their scalability to real-world, high-resolution scenarios. Direct inference on 2K images with these models incurs prohibitive computational and memory demands, making practical deployment challenging. To address this, we present 2K Retrofit, a novel framework that enables efficient 2K-resolution inference for any geometric foundation model, without modifying or retraining the backbone. Our approach leverages fast coarse predictions and an entropy-based sparse refinement to selectively enhance high-uncertainty regions, achieving precise, high-fidelity 2K outputs with minimal overhead. Extensive experiments on widely used benchmarks demonstrate that 2K Retrofit consistently achieves state-of-the-art accuracy and speed, bridging the gap between research advances and scalable deployment in high-resolution 3D vision applications. Code will be released upon acceptance.
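The idea of entropy-guided sparse refinement can be illustrated with a small sketch. The abstract does not say how the entropy is computed or how regions are chosen, so everything below is an assumption made for illustration: the coarse stage is assumed to give a per-pixel categorical distribution over `K` depth bins, uncertainty is taken as the Shannon entropy of that distribution, regions are non-overlapping square patches, and `refine_fn` is a hypothetical stand-in for whatever refinement module is applied. It is not the authors' implementation.

```python
import numpy as np

def pixel_entropy(probs, eps=1e-12):
    """Shannon entropy per pixel; probs has shape (H, W, K), summing to 1 over K."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)

def select_patches(entropy, patch=4, keep_ratio=0.25):
    """Return (row, col) grid indices of the patches with the highest mean entropy."""
    h, w = entropy.shape
    gh, gw = h // patch, w // patch
    # Average the entropy inside each non-overlapping patch.
    grid = entropy[:gh * patch, :gw * patch]
    grid = grid.reshape(gh, patch, gw, patch).mean(axis=(1, 3))
    k = max(1, int(round(keep_ratio * gh * gw)))
    top = np.argsort(grid.ravel())[::-1][:k]
    return [(int(i // gw), int(i % gw)) for i in top]

def sparse_refine(coarse, patches, refine_fn, patch=4):
    """Apply refine_fn (hypothetical) only to the selected patches."""
    out = coarse.copy()
    for r, c in patches:
        window = (slice(r * patch, (r + 1) * patch),
                  slice(c * patch, (c + 1) * patch))
        out[window] = refine_fn(coarse[window])
    return out

# Toy input: an 8x8 map with K=4 bins, confident everywhere except the top-left patch.
probs = np.zeros((8, 8, 4))
probs[..., 0] = 1.0            # one-hot, so entropy is close to 0
probs[:4, :4, :] = 0.25        # uniform, so entropy is log(4)

entropy = pixel_entropy(probs)
selected = select_patches(entropy, patch=4, keep_ratio=0.25)
coarse = np.zeros((8, 8))
refined = sparse_refine(coarse, selected, refine_fn=lambda x: x + 1.0, patch=4)
```

In this toy case only one of the four patches is passed to `refine_fn`, so the refinement cost scales with `keep_ratio` rather than with the full image area, which is the kind of saving that selective refinement is meant to provide.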
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Point Map Estimation | ETH3D | -- | 31 |
| Monocular Depth Estimation | ScanNet++ (test) | RMSE 0.0774 | 20 |
| Monocular Depth Estimation | ARKitScenes (test) | AbsRel 1.18 | 11 |
| Monocular Depth Estimation | ETH3D 71 | AbsRel 0.0192 | 8 |
| Dense Multi-View Stereo Estimation | ETH3D | Precision 84.01 | 5 |