TANGO: Clustering with Typicality-Aware Nonlocal Mode-Seeking and Graph-Cut Optimization
About
Density-based mode-seeking methods generate a \emph{density-ascending dependency} from low-density points towards higher-density neighbors. Current mode-seeking methods identify modes by breaking some dependency connections, but relying heavily on local data characteristics, requiring case-by-case threshold settings or human intervention to be effective for different datasets. To address this issue, we introduce a novel concept called \emph{typicality}, by exploring the \emph{locally defined} dependency from a \emph{global} perspective, to quantify how confident a point would be a mode. We devise an algorithm that effectively and efficiently identifies modes with the help of the global-view typicality. To implement and validate our idea, we design a clustering method called TANGO, which not only leverages typicality to detect modes, but also utilizes graph-cut with an improved \emph{path-based similarity} to aggregate data into the final clusters. Moreover, this paper also provides some theoretical analysis on the proposed algorithm. Experimental results on several synthetic and extensive real-world datasets demonstrate the effectiveness and superiority of TANGO. The code is available at https://github.com/SWJTU-ML/TANGO_code.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Unsupervised Clustering | BreastMNIST | Accuracy58 | 25 | |
| Unsupervised Clustering | PneumoniaMNIST | Accuracy90 | 25 | |
| Unsupervised Clustering | NIFD (unsupervised) | Accuracy54.8 | 25 | |
| Unsupervised Clustering | ADNI CN vs AD | Accuracy51.6 | 25 | |
| Clustering | liver ultrasound dataset | Accuracy69.4 | 24 |