FeatUp: A Model-Agnostic Framework for Features at Any Resolution

About

Deep features are a cornerstone of computer vision research, capturing image semantics and enabling the community to solve downstream tasks even in the zero- or few-shot regime. However, these features often lack the spatial resolution to directly perform dense prediction tasks like segmentation and depth prediction because models aggressively pool information over large areas. In this work, we introduce FeatUp, a task- and model-agnostic framework to restore lost spatial information in deep features. We introduce two variants of FeatUp: one that guides features with high-resolution signal in a single forward pass, and one that fits an implicit model to a single image to reconstruct features at any resolution. Both approaches use a multi-view consistency loss with deep analogies to NeRFs. Our features retain their original semantics and can be swapped into existing applications to yield resolution and performance gains even without re-training. We show that FeatUp significantly outperforms other feature upsampling and image super-resolution approaches in class activation map generation, transfer learning for segmentation and depth prediction, and end-to-end training for semantic segmentation.
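The abstract names a multi-view consistency loss but does not spell it out. As a rough illustration only, the toy sketch below shows one simplified reading of such an objective: a candidate high-resolution feature map, once transformed and downsampled, should agree with the low-resolution features the backbone produces for the same transformed image. Everything in it is an assumption made for illustration: the average-pooling "backbone", the circular-shift views, and all function names are hypothetical stand-ins, not FeatUp's networks or API.

```python
# Toy, single-channel illustration of a multi-view consistency objective.
# The pooling "backbone", the shift views, and every name below are
# hypothetical stand-ins chosen for this sketch, not FeatUp's actual code.

def avg_pool(grid, k):
    """Average non-overlapping k x k blocks of a 2D list (stand-in for a pooling backbone)."""
    h, w = len(grid), len(grid[0])
    return [
        [
            sum(grid[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]

def upsample_nearest(grid, k):
    """Repeat every value k times along both axes (naive, blocky upsampling)."""
    return [[v for v in row for _ in range(k)] for row in grid for _ in range(k)]

def shift(grid, dy, dx):
    """Circularly shift a 2D list; each shift plays the role of one 'view'."""
    h, w = len(grid), len(grid[0])
    return [[grid[(i - dy) % h][(j - dx) % w] for j in range(w)] for i in range(h)]

def mse(a, b):
    """Mean squared error between two equally sized 2D lists."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def multiview_loss(image, hi_res_feats, views, k):
    """Average, over views t, of || backbone(t(image)) - downsample(t(hi_res_feats)) ||^2."""
    total = 0.0
    for dy, dx in views:
        target = avg_pool(shift(image, dy, dx), k)            # low-res features of the view
        predicted = avg_pool(shift(hi_res_feats, dy, dx), k)  # downsampled high-res candidate
        total += mse(target, predicted)
    return total / len(views)

K = 2
image = [[float((3 * i + 5 * j) % 7) for j in range(8)] for i in range(8)]
views = [(0, 0), (0, 1), (1, 0), (1, 1)]

naive = upsample_nearest(avg_pool(image, K), K)       # blocky upsampling of low-res features
loss_naive = multiview_loss(image, naive, views, K)   # penalized by the shifted views
loss_exact = multiview_loss(image, image, views, K)   # in this toy, the image itself is the ideal answer

print(loss_naive, loss_exact)
```

In this toy setting the blocky upsampling agrees with the backbone on the unshifted view but is penalized on the shifted ones, while the true high-resolution signal incurs zero loss. That is the intuition behind using several views to recover sub-patch detail.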

Stephanie Fu, Mark Hamilton, Laura Brandt, Axel Feldman, Zhoutong Zhang, William T. Freeman • 2024

Related benchmarks

Task                          Dataset            Result              Rank
Semantic segmentation         ADE20K (val)       mIoU 54.22          2888
Semantic segmentation         ADE20K             mIoU 51.03          1024
Semantic segmentation         COCO Stuff         mIoU 61.95          379
Semantic segmentation         ADE20K             mIoU 38.82          366
Semantic segmentation         PASCAL VOC (val)   mIoU 83.52          362
Semantic segmentation         Pascal VOC         mIoU 0.8337         180
Monocular Depth Estimation    NYU V2             Delta 1 Acc 91.93   131
Semantic segmentation         Pascal VOC         mIoU 81.08          129
Semantic segmentation         COCO Stuff (val)   mIoU 61.77          126
Semantic Correspondence       SPair-71k (test)   PCK@0.1 29.11       125
Showing 10 of 34 rows
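
For readers unfamiliar with the metrics in the table, the following minimal sketch computes mIoU and Delta 1 accuracy from their standard definitions on tiny hand-made inputs. The function names and sample data are hypothetical, and benchmark suites typically accumulate these statistics over a whole dataset rather than per image. Both metrics are fractions in [0, 1] that are often reported as percentages, which may explain why one mIoU entry above reads 0.8337 while the others are on a 0 to 100 scale.

```python
# Minimal reference implementations of two metrics named in the table.
# Function names and sample data are illustrative assumptions.

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious)

def delta1_accuracy(pred_depth, gt_depth, threshold=1.25):
    """Fraction of pixels whose depth ratio max(pred/gt, gt/pred) is below the threshold."""
    hits = sum(1 for p, g in zip(pred_depth, gt_depth) if max(p / g, g / p) < threshold)
    return hits / len(pred_depth)

# Flattened per-pixel class labels for a tiny 6-pixel "image".
pred_labels = [0, 0, 1, 1, 2, 2]
true_labels = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred_labels, true_labels, num_classes=3)

# Flattened per-pixel depths (arbitrary units, all positive).
pred_depth = [1.0, 2.0, 3.0, 4.0]
true_depth = [1.1, 2.0, 4.0, 4.4]
delta1 = delta1_accuracy(pred_depth, true_depth)

print(f"mIoU = {miou:.4f} ({100 * miou:.2f}%)")
print(f"Delta 1 = {delta1:.4f} ({100 * delta1:.2f}%)")
```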
