
AnyUp: Universal Feature Upsampling

About

We introduce AnyUp, a method for feature upsampling that can be applied to any vision feature at any resolution, without encoder-specific training. Existing learning-based upsamplers for features like DINO or CLIP need to be re-trained for every feature extractor and thus do not generalize to different feature types at inference time. In this work, we propose an inference-time feature-agnostic upsampling architecture to alleviate this limitation and improve upsampling quality. In our experiments, AnyUp sets a new state of the art for upsampled features, generalizes to different feature types, and preserves feature semantics while being efficient and easy to apply to a wide range of downstream tasks.

Thomas Wimmer, Prune Truong, Marie-Julie Rakotosaona, Michael Oechsle, Federico Tombari, Bernt Schiele, Jan Eric Lenssen • 2025

Related benchmarks

Task                        Dataset           Result             Rank
Semantic segmentation       ADE20K (val)      mIoU 42.25         2731
Semantic segmentation       PASCAL VOC (val)  mIoU 84.33         338
Semantic segmentation       COCO Stuff        mIoU 62.16         195
Semantic segmentation       Pascal VOC        mIoU 0.84          172
Semantic segmentation       COCO Stuff (val)  mIoU 62.08         126
Monocular Depth Estimation  NYU V2            Delta 1 Acc 92.33  113
Semantic segmentation       VOC               mIoU 84            44
Semantic segmentation       ADE20K            mIoU 42.43         30
Surface Normal Estimation   NYU V2            RMSE 31.17         23
Depth Estimation            COCO (val)        δ1 61.32           9
Showing 10 of 12 rows
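
For readers unfamiliar with the metrics listed in the benchmarks, the following is a minimal sketch of how mIoU and δ1 (Delta 1 accuracy) are commonly defined. It is not taken from the AnyUp evaluation code: the function names `mean_iou` and `delta1`, the flat-list inputs, and the 1.25 threshold are illustrative assumptions based on the usual conventions for these metrics.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes (hypothetical helper).

    pred and target are flat sequences of integer class labels, one per pixel.
    Classes that appear in neither sequence are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def delta1(pred, target, threshold=1.25):
    """Fraction of pixels whose depth ratio max(pred/gt, gt/pred) is below threshold.

    Assumes strictly positive depths; 1.25 is the conventional delta-1 threshold.
    """
    hits = sum(1 for p, t in zip(pred, target) if max(p / t, t / p) < threshold)
    return hits / len(target)


# Tiny usage example with made-up labels and depths.
labels_pred, labels_gt = [0, 0, 1, 1], [0, 1, 1, 1]
depth_pred, depth_gt = [1.0, 2.0, 4.0], [1.0, 2.4, 8.0]
miou = mean_iou(labels_pred, labels_gt, num_classes=2)
d1 = delta1(depth_pred, depth_gt)
```

Both functions return a fraction in [0, 1]; multiplying by 100 gives a percentage. The benchmark rows appear to mix the two scales (for example 0.84 on Pascal VOC versus 84.33 on PASCAL VOC (val)), so results from different rows may not be directly comparable without checking the scale.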
