GeoFormer: A Swin Transformer-Based Framework for Scene-Level Building Height and Footprint Estimation from Sentinel Imagery

About

Accurate three-dimensional urban data are critical for climate modelling, disaster risk assessment, and urban planning, yet remain scarce due to reliance on proprietary sensors or poor cross-city generalisation. We propose GeoFormer, an open-source Swin Transformer framework that jointly estimates building height (BH) and footprint (BF) on a 100 m grid using only Sentinel-1/2 imagery and open DEM data. A geo-blocked splitting strategy ensures strict spatial independence between training and test sets. Evaluated over 54 diverse cities, GeoFormer achieves a BH RMSE of 3.19 m and a BF RMSE of 0.05, improvements of 7.5% and 15.3%, respectively, over the strongest CNN baseline, while maintaining a BH RMSE under 3.5 m in cross-continent transfer. Ablation studies confirm that DEM is indispensable for height estimation and that optical reflectance dominates over SAR, though multi-source fusion yields the best overall accuracy. All code, weights, and global products are publicly released.
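The abstract does not spell out how the geo-blocked split is built, so the following is only a minimal sketch of the general idea: grid cells are grouped into coarse spatial blocks, and whole blocks (never individual cells) are assigned to the training or test set. The function names, the 10 km block size, the 20% test fraction, and the hash-based assignment are all illustrative assumptions, not the authors' implementation.

```python
import hashlib


def block_id(x_m, y_m, block_size_m=10_000.0):
    """Return the coarse block containing a cell, given projected coordinates in metres.

    The block size is a hypothetical choice for illustration.
    """
    return (int(x_m // block_size_m), int(y_m // block_size_m))


def geo_blocked_split(cells, block_size_m=10_000.0, test_fraction=0.2, seed=0):
    """Split (x_m, y_m) cells into train/test lists by block, not by cell.

    All cells that fall in the same block end up on the same side of the
    split, so neighbouring cells within a block never straddle train and test.
    """
    train, test = [], []
    for cell in cells:
        bx, by = block_id(cell[0], cell[1], block_size_m)
        # A deterministic hash of the block id decides the side for every cell in it.
        digest = hashlib.sha256(f"{seed}:{bx}:{by}".encode()).digest()
        u = int.from_bytes(digest[:8], "big") / 2**64  # pseudo-uniform value in [0, 1)
        (test if u < test_fraction else train).append(cell)
    return train, test
```

Because the assignment depends only on the block id and the seed, the split is reproducible, and the realised test share only approximates `test_fraction` when the number of blocks is small.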

Han Jinzhen, JinByeong Lee, JiSung Kim, MinKyung Cho, DaHee Kim, HongSik Yun • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Building Height Estimation | Sentinel | -- | 6 |
| Building Height Estimation | Sentinel + LiDAR | -- | 2 |
| Building Height Estimation | Sentinel + DEM | RMSE (m): 3.19 | 1 |
| Building Footprint Estimation | Sentinel | -- | 1 |
| Building Footprint Estimation | VHR satellite image | -- | 1 |
| Building Height Estimation | LiDAR | -- | 1 |
| Building Height Estimation | VHR SAR | -- | 1 |
| Building Height Estimation | GIS only | -- | 1 |
| Building Height Estimation | VHR satellite image | -- | 1 |