
Curvature-Aware PCA with Geodesic Tangent Space Aggregation for Semi-Supervised Learning

About

Principal Component Analysis (PCA) is a fundamental tool for representation learning, but its global linear formulation fails to capture the structure of data supported on curved manifolds. In contrast, manifold learning methods model nonlinearity but often sacrifice the spectral structure and stability of PCA. We propose Geodesic Tangent Space Aggregation PCA (GTSA-PCA), a geometric extension of PCA that integrates curvature awareness and geodesic consistency within a unified spectral framework. Our approach replaces the global covariance operator with curvature-weighted local covariance operators defined over a $k$-nearest neighbor graph, yielding local tangent subspaces that adapt to the manifold while suppressing high-curvature distortions. We then introduce a geodesic alignment operator that combines intrinsic graph distances with subspace affinities to globally synchronize these local representations. The resulting operator admits a spectral decomposition whose leading components define a geometry-aware embedding. We further incorporate semi-supervised information to guide the alignment, improving discriminative structure with minimal supervision. Experiments on real datasets show consistent improvements over PCA, Kernel PCA, Supervised PCA and strong graph-based baselines such as UMAP, particularly in small sample size and high-curvature regimes. Our results position GTSA-PCA as a principled bridge between statistical and geometric approaches to dimensionality reduction.
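The abstract names the stages of the pipeline but gives no formulas, so the following is only a rough NumPy sketch of how such stages could fit together. It is not the author's implementation. Every concrete choice is an assumption made for illustration: the function name `gtsa_pca_sketch`, the curvature proxy (share of local variance outside the top-`d` directions), the weight `exp(-curvature / sigma_c)`, the Gaussian kernel on graph shortest-path distances, the subspace affinity `||U_i^T U_j||_F^2 / d`, the label boost `beta`, and the double-centering step borrowed from Kernel PCA.

```python
import numpy as np

def gtsa_pca_sketch(X, n_components=2, k=10, d=2, sigma_c=0.5, labels=None, beta=0.5):
    """Illustrative pipeline only; all formulas are assumptions, not the paper's."""
    n = X.shape[0]
    # Pairwise Euclidean distances and brute-force k-nearest neighbors.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    nbrs = np.argsort(dist, axis=1)[:, 1:k + 1]

    # Local covariance per point: tangent basis plus a curvature proxy.
    bases = np.empty((n, X.shape[1], d))
    curv = np.empty(n)
    for i in range(n):
        P = X[nbrs[i]] - X[nbrs[i]].mean(axis=0)
        evals, evecs = np.linalg.eigh(P.T @ P / k)
        evals = np.clip(evals[::-1], 0.0, None)      # descending order
        bases[i] = evecs[:, ::-1][:, :d]             # top-d local directions
        # Assumed proxy: variance share outside the top-d directions.
        curv[i] = evals[d:].sum() / max(evals.sum(), 1e-12)
    w = np.exp(-curv / sigma_c)                      # down-weight curved patches

    # Geodesic distances: shortest paths on the symmetrized kNN graph
    # (Floyd-Warshall, O(n^3), adequate for small n only).
    G = np.full((n, n), np.inf)
    rows = np.repeat(np.arange(n), k)
    G[rows, nbrs.ravel()] = dist[rows, nbrs.ravel()]
    G = np.minimum(G, G.T)
    np.fill_diagonal(G, 0.0)
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    finite = np.isfinite(G)
    G[~finite] = G[finite].max()                     # crude fallback if disconnected

    # Subspace affinity in [0, 1]: ||U_i^T U_j||_F^2 / d.
    M = np.einsum('iad,jae->ijde', bases, bases)
    aff = (M ** 2).sum(axis=(2, 3)) / d

    # Assumed alignment kernel: geodesic proximity x subspace affinity x weights.
    scale = np.median(G[G > 0])
    A = np.exp(-(G / scale) ** 2) * aff * np.outer(w, w)

    # Optional semi-supervision (labels < 0 mean "unlabeled"):
    # strengthen same-class links and weaken cross-class links.
    if labels is not None:
        lab = np.asarray(labels)
        known = (lab[:, None] >= 0) & (lab[None, :] >= 0)
        same = lab[:, None] == lab[None, :]
        A = A * np.where(known & same, 1 + beta, np.where(known & ~same, 1 - beta, 1.0))

    # Spectral step: double-center and keep the leading eigenpairs.
    H = np.eye(n) - 1.0 / n
    B = H @ A @ H
    evals, evecs = np.linalg.eigh((B + B.T) / 2)
    top = np.argsort(evals)[::-1][:n_components]
    return evecs[:, top] * np.sqrt(np.clip(evals[top], 0.0, None))

# Usage on a synthetic rolled-up sheet (made-up data, not from the paper).
rng = np.random.default_rng(0)
t = rng.uniform(0, 3 * np.pi, 120)
X = np.column_stack([t * np.cos(t), rng.uniform(0, 5, 120), t * np.sin(t)])
Y = gtsa_pca_sketch(X, n_components=2, k=8, d=2)
print(Y.shape)  # (120, 2)
```

Negative eigenvalues are clipped because a Gaussian kernel on graph distances is not guaranteed to be positive semidefinite. The dense shortest-path step keeps the sketch dependency-free but would need a sparse solver for anything beyond a few hundred points.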

Alexandre L. M. Levada • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Clustering | Fashion MNIST | -- | 107 |
| Clustering | Breast | ARI 0.7138 | 28 |
| Clustering | SEMEION | ARI 16.67 | 19 |
| Clustering | MFeat Karhunen | ARI 11.2 | 10 |
| Clustering | Engine1 | ARI 26.58 | 7 |
| Clustering | heart-h | ARI 0.3168 | 7 |
| Clustering | AP_Colon_Lung | ARI 0.6476 | 7 |
| Clustering | AP_Lung_Kidney | ARI 0.4912 | 7 |
| Clustering | ionosphere | ARI 24.81 | 7 |
| Clustering | User Knowledge | ARI 0.1968 | 6 |
Showing 10 of 55 rows
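
For readers unfamiliar with the metric, the Adjusted Rand Index (ARI) reported in the table can be computed from the contingency counts of two labelings. The snippet below is a minimal standard-library sketch of the usual pair-counting definition; the function name `adjusted_rand_index` is chosen here for illustration and is not taken from the paper. Under this definition ARI cannot exceed 1, so entries such as 16.67 or 24.81 are presumably reported on a percentage scale. That is an assumption, as the page does not state the scale.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI: 1.0 for identical partitions, about 0 for random ones."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # degenerate case: no pairs to compare
    # Number of co-clustered pairs in the contingency table and its margins.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_a * sum_b / comb(n, 2)   # expected agreement under chance
    max_index = (sum_a + sum_b) / 2         # largest attainable agreement
    if max_index == expected:
        return 1.0  # both partitions are trivial (all-in-one or all-singletons)
    return (sum_ij - expected) / (max_index - expected)

# Cluster names do not matter, only which points are grouped together.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5
```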
