Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Kempe Swap K-Means: A Scalable Near-Optimal Solution for Semi-Supervised Clustering

About

This paper presents a novel centroid-based heuristic algorithm, termed Kempe Swap K-Means, for constrained clustering under rigid must-link (ML) and cannot-link (CL) constraints. The algorithm employs a dual-phase iterative process: an assignment step that utilizes Kempe chain swaps to refine current clustering in the constrained solution space and a centroid update step that computes optimal cluster centroids. To enhance global search capabilities and avoid local optima, the framework incorporates controlled perturbations during the update phase. Empirical evaluations demonstrate that the proposed method achieves near-optimal partitions while maintaining high computational efficiency and scalability. The results indicate that Kempe Swap K-Means consistently outperforms state-of-the-art benchmarks in both clustering accuracy and algorithmic efficiency for large-scale datasets.

Yuxuan Ren, Shijie Deng• 2026

Related benchmarks

TaskDatasetResultRank
Constrained ClusteringCOL2
CPU Time (multiples of KSKM)1
130
Constrained ClusteringCOL2
Min Inertia (Relative to KSKM)1
70
ClusteringCOL2
Clustering Inertia (KSKM multiple)0.98
60
Constrained ClusteringCOL2
Success Rate100
60
Constrained ClusteringCOL1
CPU Time (multiples of KSKM)1
52
Constrained ClusteringCOL1
Minimum Clustering Inertia (Relative to KSKM)1
28
ClusteringCIFAR-100--
25
ClusteringCOL1
Clustering Inertia (KSKM Multiple)1
24
ClusteringCOL3
Clustering Inertia (Multiple of KSKM)0.98
24
Constrained ClusteringCOL1
Success Rate100
24
Showing 10 of 31 rows

Other info

Follow for update