A Randomized Algorithm for Sparse PCA based on the Basic SDP Relaxation

About

Sparse Principal Component Analysis (SPCA) is a fundamental technique for dimensionality reduction, and is NP-hard. In this paper, we introduce a randomized approximation algorithm for SPCA, which is based on the basic SDP relaxation. Our algorithm takes an (approximate) SDP solution, constructs one deterministic sparse solution and several randomized solutions, and outputs the best among them. Our algorithm has an approximation ratio of at most the sparsity constant with high probability, if called enough times. Under a technical assumption, which is consistently satisfied in our numerical tests, the average approximation ratio is also bounded by $\mathcal{O}(\log{d})$, where $d$ is the number of features. We show that this technical assumption is satisfied if the SDP solution is low-rank, or has exponentially decaying eigenvalues. We then present two classes of instances for which this technical assumption holds. We also demonstrate that in a covariance model, which generalizes the spiked Wishart model, the deterministic solution in our algorithm achieves a near-optimal approximation ratio. We demonstrate the efficacy of our algorithm through numerical tests on real-world datasets.

Alberto Del Pia, Dekun Zhou• 2025

Related benchmarks

Task	Dataset	Result
Sparse Principal Component Analysis	Eisen 1	Time0.296	12
Sparse Principal Component Analysis	News	Objective Value7.37e+3	6
Sparse Principal Component Analysis	Eisen2	Objective Value27.573	6
Sparse Principal Component Analysis	CovColon	Objective Value1.23e+4	6
Sparse Principal Component Analysis	LymphomaCov	Objective Function Value2.06e+3	6
Sparse Principal Component Analysis	Reddit1500	Objective Value1.20e+3	6
Sparse Principal Component Analysis	Reddit2000	Objective Value2.54e+3	6

Showing 7 of 7 rows

Other info

Follow for update

@wizwand_team Discord