
Exact Stiefel Optimization for Probabilistic PLS: Closed-Form Updates, Error Bounds, and Calibrated Uncertainty

About

Probabilistic partial least squares (PPLS) is a central likelihood-based model for two-view learning when one needs both interpretable latent factors and calibrated uncertainty. Building on the identifiable parameterization of Bouhaddani et al. (2018), existing fitting pipelines still face two practical bottlenecks: noise-signal coupling under joint EM/ECM updates and nontrivial handling of orthogonality constraints. Following the fixed-noise scalar-likelihood line of Hu et al. (2025), we develop an end-to-end framework that combines noise pre-estimation, constrained likelihood optimization, and prediction calibration in one pipeline. Relative to Hu et al. (2025), we replace full-spectrum noise averaging with noise-subspace estimation and replace interior-point penalty handling with exact Stiefel-manifold optimization. The noise-subspace estimator attains a signal-strength-independent leading finite-sample rate and matches a minimax lower bound, while the full-spectrum estimator is shown to be inconsistent under the same model. We further extend the framework to sub-Gaussian settings via optional Gaussianization and provide closed-form standard errors through a block-structured Fisher analysis. Across synthetic high-noise settings and two multi-omics benchmarks (TCGA-BRCA and PBMC CITE-seq), the method achieves near-nominal coverage without post-hoc recalibration, reaches Ridge-level point accuracy on TCGA-BRCA at rank $r=3$, matches or exceeds PO2PLS on cross-view prediction while providing native calibrated uncertainty, and improves stability of parameter recovery.
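The contrast between full-spectrum noise averaging and noise-subspace estimation can be illustrated with a small NumPy sketch. It is not the authors' estimator: it assumes a simplified single-view model x = W t + e with orthonormal loadings W, independent latent factors, and isotropic Gaussian noise, and it treats the rank r as known. All function names and the simulation settings (n, p, signal variances) are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_view(n, p, signal_vars, noise_var, rng):
    """Draw n samples of x = W t + e (assumed simplified single-view model)."""
    r = len(signal_vars)
    # Orthonormal loadings: a point on the Stiefel manifold St(p, r).
    W, _ = np.linalg.qr(rng.standard_normal((p, r)))
    T = rng.standard_normal((n, r)) * np.sqrt(signal_vars)  # latent scores
    E = rng.standard_normal((n, p)) * np.sqrt(noise_var)    # isotropic noise
    return T @ W.T + E

def full_spectrum_noise(X):
    """Average of all eigenvalues of the sample covariance, i.e. trace(S) / p."""
    n, p = X.shape
    S = X.T @ X / n  # data are simulated with zero mean, so no centering
    return np.trace(S) / p

def noise_subspace_noise(X, r):
    """Average of the p - r smallest eigenvalues (rank r assumed known)."""
    n, p = X.shape
    S = X.T @ X / n
    eigvals = np.linalg.eigvalsh(S)  # returned in ascending order
    return eigvals[: p - r].mean()

rng = np.random.default_rng(0)
signal_vars = np.array([20.0, 10.0, 5.0])  # hypothetical latent variances
X = simulate_view(n=2000, p=50, signal_vars=signal_vars, noise_var=1.0, rng=rng)

sigma2_full = full_spectrum_noise(X)
sigma2_sub = noise_subspace_noise(X, r=3)
print(f"full-spectrum estimate:  {sigma2_full:.3f}")
print(f"noise-subspace estimate: {sigma2_sub:.3f}")
```

Under this assumed model the population covariance has r eigenvalues equal to signal variance plus noise variance and p - r eigenvalues equal to the noise variance alone. Averaging the whole spectrum therefore targets the noise variance plus the total signal variance divided by p, a bias that grows with signal strength, whereas averaging only the trailing eigenvalues targets the noise variance itself. This is consistent with the abstract's claim that the full-spectrum estimator is inconsistent, though the paper's exact estimator and rates may differ from this sketch.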

Haoran Hu, Xingce Wang • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
Uncertainty Calibration | Gaussian synthetic benchmark (5-fold CV) | Empirical Coverage | 95.1 | 10
Uncertainty Estimation | TCGA-BRCA | MSE | 0.4498 | 9
Uncertainty Estimation | CITE-seq | MSE | 0.2586 | 9
Gene-protein pair detection | TCGA-BRCA | Total Detected Pairs | 1.27e+5 | 8
Protein imputation | PBMC CITE-seq (3-fold CV) | MSE | 0.2586 | 7
Parameter Estimation | Synthetic p=q=200, M=20, Low noise | MSE (W) | 0.01 | 6
Parameter Estimation | Synthetic p=q=200, M=20, High noise | MSE_W | 0.08 | 6
Parameter Estimation | Synthetic p=q=500, M=10, Low noise | MSE (W) | 0.01 | 5
Parameter Estimation | Synthetic p=q=500, M=10, High noise | MSE_W | 0.08 | 5
Point Prediction | TCGA-BRCA (5-fold CV) | MSE | 0.4498 | 5
Showing 10 of 11 rows
