Federated Imputation under Heterogeneous Feature Spaces

About

Federated learning (FL) enables collaborative training across decentralized clients, but most methods assume aligned feature schemas. That assumption rarely holds in tabular settings, where clients observe only partially overlapping feature subsets. In these heterogeneous feature spaces, parameter-averaging methods (e.g., FedAvg) transfer little information across weakly overlapping or disjoint feature groups, which limits their effectiveness for federated imputation. To address this, we propose FedHF-Impute, a federated imputation framework that separates structural feature unavailability from conventional missingness and uses a shared global feature graph to propagate information across statistically related features through message passing. This enables indirect cross-client knowledge transfer, even when features are never jointly observed locally, while preserving standard federated communication. Under simulated partial schema overlap on the SECOM and AirQuality datasets, FedHF-Impute improves imputation accuracy (RMSE) over FL baselines by 26.9% and 8.4%, respectively, while achieving comparable performance on PhysioNet, with only a 0.3% difference relative to the best baseline.
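The two ideas in the abstract, separating structural unavailability from conventional missingness and propagating information over a feature graph, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `build_masks` and `propagate`, the zero-fill initialization, the hand-written adjacency matrix `A`, and the mixing weight `alpha` are all assumptions made for illustration. In the paper the graph is shared globally across clients; here it is simply given.

```python
import numpy as np

def build_masks(X, client_features):
    """Split entries into observed, conventionally missing, and structurally
    unavailable (hypothetical helper, not from the paper)."""
    in_schema = np.zeros(X.shape[1], dtype=bool)
    in_schema[client_features] = True
    observed = ~np.isnan(X) & in_schema            # value recorded by this client
    missing = np.isnan(X) & in_schema              # feature in schema, value absent
    structural = np.broadcast_to(~in_schema, X.shape)  # feature never collected here
    return observed, missing, structural

def propagate(X_filled, A, observed, alpha=0.5):
    """One message-passing step over a feature graph (illustrative only).
    Each feature receives the degree-normalized average of its neighbors;
    observed entries are kept unchanged."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1, keepdims=True)
    P = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)  # row-normalize
    messages = X_filled @ P.T                      # messages[i, j] = sum_k P[j, k] * X[i, k]
    mixed = (1.0 - alpha) * X_filled + alpha * messages
    return np.where(observed, X_filled, mixed)

# Toy client: its schema holds features 0 and 1; feature 2 is never collected.
X = np.array([[1.0, np.nan, np.nan],
              [2.0, 4.0,    np.nan]])
observed, missing, structural = build_masks(X, client_features=[0, 1])

# Assumed graph: feature 0 is related to features 1 and 2.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])

X_filled = np.where(observed, X, 0.0)              # zero-fill is an assumption
imputed = propagate(X_filled, A, observed, alpha=1.0)
print(imputed)  # [[1. 1. 1.] [2. 4. 2.]]
```

In this toy run, feature 2 receives values through its graph edge to feature 0 even though the client never observes it, which is the kind of indirect transfer the abstract describes. A learned model would replace the plain neighbor average used here.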

Imane Hocine, Chaimaa Medjadji, Sylvain Kubler, Gregoire Danoy, Yves Le Traon • 2026

Related benchmarks

Task	Dataset	Result
Imputation	Air quality (test)	--	11
Data Imputation	SECOM (test)	RMSE 0.8205	10
Data Imputation	PhysioNet (test)	RMSE 0.9363	10

