Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Bridged Clustering: Semi-Supervised Sparse Bridging

About

We introduce Bridged Clustering, a semi-supervised framework to learn predictors from any unpaired input $X$ and output $Y$ dataset. Our method first clusters $X$ and $Y$ independently, then learns a sparse, interpretable bridge between clusters using only a few paired examples. At inference, a new input $x$ is assigned to its nearest input cluster, and the centroid of the linked output cluster is returned as the prediction $\hat{y}$. Unlike traditional SSL, Bridged Clustering explicitly leverages output-only data, and unlike dense transport-based methods, it maintains a sparse and interpretable alignment. Through theoretical analysis, we show that with bounded mis-clustering and mis-bridging rates, our algorithm becomes an effective and efficient predictor. Empirically, our method is competitive with SOTA methods while remaining simple, model-agnostic, and highly label-efficient in low-supervision settings.

Patrick Peixuan Ye, Chen Shani, Ellen Vitercik• 2025

Related benchmarks

TaskDatasetResultRank
Cluster MappingFlickr30k rev. (inductive)
Win-rate60
7
Cluster MappingFlickr30k (inductive)
Win Rate45
7
Bridged ClusteringFlickr30k standard (transductive)
Win-rate56
6
Bridged ClusteringWIT standard (transductive)
Win-rate12
6
Bridged ClusteringWIT reversed input/output mapping (transductive)
Win Rate9
6
Cluster MappingWIT (inductive)
Win Rate12
6
Cluster MappingWIT rev. (inductive)
Win Rate9
6
Bridged ClusteringBIOSCAN standard (transductive)
Win-rate67
5
Bridged ClusteringFlickr30k reversed input/output mapping (transductive)
Win Rate71
5
Cluster MappingBIOSCAN (inductive)
Win Rate67
5
Showing 10 of 16 rows

Other info

Follow for update