
Gaussian Splatting Feature Fields for Privacy-Preserving Visual Localization

About

Visual localization is the task of estimating a camera pose in a known environment. In this paper, we use 3D Gaussian Splatting (3DGS)-based representations for accurate and privacy-preserving visual localization. We propose Gaussian Splatting Feature Fields (GSFFs), a scene representation for visual localization that combines an explicit geometry model (3DGS) with an implicit feature field. We leverage the dense geometric information and differentiable rasterization algorithm from 3DGS to learn robust feature representations grounded in 3D. In particular, we align a 3D scale-aware feature field and a 2D feature encoder in a common embedding space through a contrastive framework. Using a 3D structure-informed clustering procedure, we further regularize the representation learning and seamlessly convert the features to segmentations, which can be used for privacy-preserving visual localization. Localization is performed by pose refinement, which aligns either feature maps or segmentations from a query image with those rendered from the GSFFs scene representation. The resulting privacy-preserving and non-privacy-preserving localization pipelines, evaluated on multiple real-world datasets, achieve state-of-the-art performance.
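
The abstract describes aligning a 3D feature field and a 2D feature encoder in a common embedding space through a contrastive framework, without stating the exact loss. As a rough illustration only, the sketch below uses a symmetric InfoNCE loss, a common contrastive objective that may differ from the one used in the paper. The function names, the temperature value, and the assumption that row i of each input is a matching pixel pair (one feature rendered from the field, one from the encoder) are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def symmetric_info_nce(feats_3d, feats_2d, temperature=0.07):
    """Symmetric InfoNCE loss over paired feature vectors.

    Assumption (not from the paper): feats_3d[i] and feats_2d[i] describe
    the same pixel, and every other pairing is treated as a negative.
    """
    a = [l2_normalize(v) for v in feats_3d]
    b = [l2_normalize(v) for v in feats_2d]
    n = len(a)

    # Cosine similarity between every 3D/2D pair, scaled by the temperature.
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_on_diagonal(rows):
        # Mean of -log softmax(row)[i], computed with a stable log-sum-exp.
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)
            log_z = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    # Average the 3D-to-2D and 2D-to-3D directions.
    columns = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_on_diagonal(logits) + cross_entropy_on_diagonal(columns))
```

Under these assumptions, the loss is small when each rendered feature is most similar to its own encoder feature and large when the pairing is scrambled, which is the behavior a shared embedding space needs.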

Maxime Pietrantoni, Gabriela Csurka, Torsten Sattler • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Visual Localization | 7Scenes | Median Translation Error (cm) - Chess | 0.4 | 66
Visual Localization | Cambridge Landmarks | King's Positional Error (cm) | 17 | 48
Visual Localization | 7Scenes Fire | Median Translation Error (cm) | 0.8 | 34
Visual Localization | 7Scenes Chess | Median Translation Error (cm) | 0.8 | 34
Visual Localization | 7Scenes (Office) | Median Translation Error (cm) | 1.5 | 34
Visual Localization | 7Scenes RedKitchen | Median Translation Error (cm) | 1.2 | 34
Visual Localization | 7Scenes Pumpkin | Median Translation Error (cm) | 2 | 34
Visual Localization | 7Scenes Heads | Median Translation Error (cm) | 1 | 34
Visual Localization | 12Scenes | Average Median Translation Error (cm) | 0.6 | 15
Visual Localization | Indoor-6 v1 (scene2a) | MPE (m) | 0.06 | 12
Showing 10 of 17 rows
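
Most rows above report a median translation error in centimeters. The sketch below shows how such a number is commonly computed: the Euclidean distance between each estimated and ground-truth camera position, with the median taken over all query images. The function name and the assumption that positions are given in meters are illustrative; each benchmark defines its own evaluation protocol.

```python
import math
import statistics


def median_translation_error_cm(estimated_positions, ground_truth_positions):
    """Median Euclidean distance between paired camera positions, in cm.

    Assumption: both inputs are sequences of (x, y, z) positions in meters,
    listed in the same query order.
    """
    errors_cm = [
        100.0 * math.dist(est, gt)
        for est, gt in zip(estimated_positions, ground_truth_positions)
    ]
    return statistics.median(errors_cm)
```

Because the median is used rather than the mean, a few badly localized queries do not dominate the reported value.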
