Reducing Bias and Variance: Generative Semantic Guidance and Bi-Layer Ensemble for Image Clustering

About

Image clustering aims to partition unlabeled image datasets into distinct groups. A core aspect of this task is constructing and leveraging prior knowledge to guide the clustering process. Recent approaches introduce semantic descriptions as prior information, and most of them rely on matching-based techniques with predefined vocabularies. However, the limited matching space restricts their adaptability to downstream clustering tasks. Moreover, these methods focus primarily on reducing bias to improve performance and frequently overlook the importance of variance reduction. To address these limitations, we propose GSEC (Image Clustering based on Generative Semantic Guidance and Bi-Layer Ensemble), a framework designed to reduce bias through generative semantic guidance and to mitigate variance via ensemble learning. Our method employs Multimodal Large Language Models to generate semantic descriptions and derives image embeddings via weighted averaging. Additionally, a bi-layer ensemble strategy integrates cross-modal information through BatchEnsemble in the inner layer and aligns outputs via an alignment mechanism in the outer layer. Comparative experiments demonstrate that GSEC outperforms 18 state-of-the-art methods across six benchmark datasets, while further analysis confirms its effectiveness in simultaneously reducing both bias and variance. The code is available at https://github.com/2017LI/GSEC.git.
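The abstract does not say how the outer layer aligns the ensemble outputs. As a rough illustration of why alignment is needed at all, the sketch below assumes each ensemble member produces a soft cluster-assignment matrix whose column order is arbitrary, matches columns to a reference member with the Hungarian algorithm, and averages the aligned matrices. The function names `align_to_reference` and `ensemble_average` are hypothetical, and this is one common alignment technique, not necessarily the mechanism used in GSEC.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def align_to_reference(ref_probs, probs):
    """Permute the columns of `probs` so they best match `ref_probs`.

    Both inputs are (n_samples, n_clusters) soft assignment matrices.
    Assumption: Hungarian matching on column overlap; the paper's own
    alignment mechanism may differ.
    """
    # overlap[i, j]: agreement between reference cluster i and member cluster j
    overlap = ref_probs.T @ probs
    ref_idx, member_idx = linear_sum_assignment(overlap, maximize=True)
    aligned = np.empty_like(probs)
    aligned[:, ref_idx] = probs[:, member_idx]
    return aligned


def ensemble_average(members):
    """Align every member to the first one, then average the assignments."""
    ref = members[0]
    aligned = [ref] + [align_to_reference(ref, m) for m in members[1:]]
    return np.mean(aligned, axis=0)


# Two members that agree on the grouping but number the clusters differently.
ref = np.eye(3)[[0, 1, 2, 0]]   # one-hot assignments for four images
member = ref[:, [2, 0, 1]]      # same partition, shuffled cluster ids
consensus = ensemble_average([ref, member])
```

Without the alignment step, averaging `ref` and `member` directly would blur clusters that actually agree, which is why ensembles of clusterings usually need some form of label matching before their outputs are combined.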

Feijiang Li, Zhenxiong Li, Jieting Wang, Zizheng Jiu, Saixiong Liu, Liang Du • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Image Clustering | ImageNet-10 | NMI | 0.996 | 220 |
| Clustering | Imagenet Dogs | NMI | 86.3 | 105 |
| Clustering | STL-10 | ACC | 99.4 | 64 |
| Clustering | CIFAR-10 | ACC | 96.3 | 52 |
| Image Clustering | ImageNet-1K | NMI | 81.3 | 36 |
| Clustering | CIFAR100 | Clustering Accuracy | 61.8 | 31 |
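The results listed above are reported as NMI (normalized mutual information) and ACC (clustering accuracy). As a minimal sketch of how these two metrics are commonly computed, assuming ACC uses the best one-to-one mapping between predicted clusters and ground-truth classes and NMI is normalized by the arithmetic mean of the two entropies; the helper names are illustrative and are not taken from the GSEC repository:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def contingency(y_true, y_pred):
    """Count matrix: rows are predicted clusters, columns are true classes."""
    _, t = np.unique(np.asarray(y_true).ravel(), return_inverse=True)
    _, p = np.unique(np.asarray(y_pred).ravel(), return_inverse=True)
    counts = np.zeros((p.max() + 1, t.max() + 1), dtype=np.int64)
    np.add.at(counts, (p, t), 1)
    return counts


def clustering_accuracy(y_true, y_pred):
    """Fraction of samples that are correct under the best cluster-to-class mapping."""
    counts = contingency(y_true, y_pred)
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / counts.sum()


def nmi(y_true, y_pred):
    """Mutual information divided by the mean of the two label entropies."""
    counts = contingency(y_true, y_pred)
    joint = counts / counts.sum()
    p = joint.sum(axis=1)   # predicted-cluster marginal
    t = joint.sum(axis=0)   # true-class marginal
    nz = joint > 0
    mi = (joint[nz] * np.log(joint[nz] / np.outer(p, t)[nz])).sum()

    def entropy(q):
        q = q[q > 0]
        return -(q * np.log(q)).sum()

    denom = 0.5 * (entropy(p) + entropy(t))
    return mi / denom if denom > 0 else 1.0


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [1, 1, 0, 0, 2, 0]   # cluster ids are arbitrary; one sample is misplaced
acc = clustering_accuracy(y_true, y_pred)
score = nmi(y_true, y_pred)
```

Both functions return values in [0, 1]. The listing mixes scales (0.996 for ImageNet-10, percentages elsewhere), so a value computed this way would need to be multiplied by 100 before comparing it with the percentage entries.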