Mean-Shift PCA by Knockoff Mean

About

Removing noise is difficult, but adding noise is easy. In this work, we show how to eliminate noisy mean-shift components from PCA by deliberately introducing a knockoff mean-shift perturbation. Standard PCA is highly sensitive to shifts in the sample mean: a small fraction of samples from a shifted distribution can cause large deviations in the leading principal components. In high-dimensional regimes, existing Robust PCA approaches cannot handle the mean-shift contamination structure inherent in the mixture model. Using tools from Random Matrix Theory, we prove that the mean-shift spikes are spectrally separable from the stable eigenvalues of the original covariance. Furthermore, the original eigenspace remains asymptotically invariant to the contamination, independent of the mixture weight. Exploiting this spectral stability, we propose a simple two-stage PCA algorithm that adds a knockoff mean to identify and remove the mean-shift component using only standard PCA operations.
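The sensitivity described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's knockoff algorithm: it only shows how a mean-shifted subpopulation can displace the leading principal component of standard PCA. The function names (`leading_pc`, `alignment`, `simulate`) and all parameter values (dimension, spike strength, shift size, mixture weight) are assumptions chosen for illustration and are not taken from the paper's experiments.

```python
import numpy as np


def leading_pc(X):
    """Leading eigenvector of the sample covariance of X (rows are samples)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues returned in ascending order
    return eigvecs[:, -1]


def alignment(u, v):
    """Absolute cosine similarity between two directions (sign-invariant)."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))


def simulate(n=1000, d=50, spike=10.0, pi1=0.1, shift_norm=15.0, seed=0):
    """Compare leading-PC alignment with and without mean-shift contamination.

    Hypothetical setup: the clean covariance is the identity with a single
    spike of variance `spike` along the first coordinate. A fraction `pi1`
    of the samples is shifted by a mean vector of norm `shift_norm` along
    the second coordinate, orthogonal to the true leading direction.
    """
    rng = np.random.default_rng(seed)
    u_true = np.zeros(d)
    u_true[0] = 1.0

    X = rng.standard_normal((n, d))
    X[:, 0] *= np.sqrt(spike)  # inject the covariance spike

    mu = np.zeros(d)
    mu[1] = shift_norm
    n_shift = int(round(pi1 * n))
    X_cont = X.copy()
    X_cont[:n_shift] += mu  # mean-shifted subpopulation

    clean = alignment(leading_pc(X), u_true)
    contaminated = alignment(leading_pc(X_cont), u_true)
    return clean, contaminated


if __name__ == "__main__":
    clean, contaminated = simulate()
    print(f"alignment without contamination: {clean:.3f}")
    print(f"alignment with mean-shift contamination: {contaminated:.3f}")
```

In this setup the shifted samples add roughly `pi1 * (1 - pi1) * shift_norm**2` of variance along the shift direction, which here exceeds the true spike, so the leading sample eigenvector follows the shift rather than the original covariance structure.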

Mengda Li, Zeng Li, Jianfeng Yao • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Principal Component Estimation | Gaussian setting with mean-shift and covariance-shift contamination synthetic (test) | Largest PC Alignment (%): 97.91 | 112 |
| Principal Component Analysis | Synthetic Gaussian Data | Runtime (ms): 15.9 | 12 |
| Principal Component Analysis | Synthetic Gaussian data d=900, n=1000 (test) | Runtime (ms): 14.2 | 9 |
| Principal Component Alignment | Gaussian Mixture ($d=900, n=10^3, \pi_1=5\%$) | Alignment (%): 95.85 | 7 |
| Principal Component Alignment | Gaussian Mixture ($d=900, n=10^3, \pi_1=10\%$) | Alignment: 97.16 | 7 |
| Principal Component Alignment | Gaussian Mixture ($d=900, n=10^3, \pi_1=15\%$) | PCA Alignment: 97.39 | 7 |
| Principal Component Alignment | Gaussian Mixture ($d=900, n=10^3, \pi_1=20\%$) | Alignment (%): 96.17 | 7 |