Optimal whitening and decorrelation
About
Whitening, or sphering, is a common preprocessing step in statistical analysis that transforms random variables into uncorrelated variables with unit variance. However, due to rotational freedom there are infinitely many possible whitening procedures. Consequently, a diverse range of sphering methods is in use, for example those based on principal component analysis (PCA), Cholesky matrix decomposition and zero-phase component analysis (ZCA), among others. Here we provide an overview of the underlying theory and discuss five natural whitening procedures. Subsequently, we demonstrate that investigating the cross-covariance and the cross-correlation matrix between sphered and original variables allows us to break the rotational invariance and to identify optimal whitening transformations. As a result we recommend two particular approaches: ZCA-cor whitening to produce sphered variables that are maximally similar to the original variables, and PCA-cor whitening to obtain sphered variables that maximally compress the original variables.
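The two recommended procedures can be illustrated with a minimal NumPy sketch. It assumes the usual decomposition of the covariance matrix into variances and a correlation matrix, Σ = V^{1/2} P V^{1/2}, with the eigendecomposition P = G Θ Gᵀ, so that ZCA-cor uses W = P^{-1/2} V^{-1/2} and PCA-cor uses W = Θ^{-1/2} Gᵀ V^{-1/2}. The function name `whitening_matrix`, the toy data `X`, and the mixing matrix are hypothetical and chosen only for illustration; this is not the authors' reference implementation.

```python
import numpy as np

def whitening_matrix(X, method="zca-cor"):
    """Estimate a whitening matrix W from data X (rows = samples, columns = variables).

    Illustrative sketch: uses the sample covariance and assumes it is non-singular.
    """
    sigma = np.cov(X, rowvar=False)             # sample covariance matrix
    v_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(sigma)))  # V^{-1/2}
    P = v_inv_sqrt @ sigma @ v_inv_sqrt         # correlation matrix
    theta, G = np.linalg.eigh(P)                # eigenvalues in ascending order
    order = np.argsort(theta)[::-1]             # reorder to descending
    theta, G = theta[order], G[:, order]
    theta_inv_sqrt = np.diag(theta ** -0.5)     # Theta^{-1/2}
    if method == "zca-cor":
        # W = P^{-1/2} V^{-1/2}
        return G @ theta_inv_sqrt @ G.T @ v_inv_sqrt
    if method == "pca-cor":
        # W = Theta^{-1/2} G^T V^{-1/2}
        return theta_inv_sqrt @ G.T @ v_inv_sqrt
    raise ValueError(f"unknown method: {method!r}")

# Toy data: three correlated variables built from a fixed (hypothetical) mixing matrix.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.7]])
X = rng.normal(size=(500, 3)) @ mixing

Xc = X - X.mean(axis=0)                          # center before whitening
Z_zca = Xc @ whitening_matrix(X, "zca-cor").T    # sphered, close to original variables
Z_pca = Xc @ whitening_matrix(X, "pca-cor").T    # sphered, ordered by compression

# Both sample covariance matrices should be the identity up to rounding error.
print(np.round(np.cov(Z_zca, rowvar=False), 6))
print(np.round(np.cov(Z_pca, rowvar=False), 6))
```

Both matrices satisfy W Σ Wᵀ = I; they differ only by a rotation, which is the freedom the abstract refers to. Eigenvectors are determined only up to sign, so the signs of the PCA-cor components may need to be fixed by an additional convention.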
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Batch Correction | DEEP 1 | Connectivity Score | 0.3 | 11 |
| Batch Correction | JUMP 1 | Conn. | 0.48 | 11 |
| Batch Correction | DEEP 3 | Connectivity | 0.3 | 9 |
| Batch Correction | DEEP 2 low diversity batch setting | Connectivity | 0.29 | 9 |
| Batch Correction | JUMP 2 | Connectivity | 0.34 | 9 |
| Batch Correction | JUMP 4 | Connectivity | 0.58 | 9 |
| Batch Correction | JUMP 3 1.0 (supplementary) | Connectivity | 0.74 | 9 |
| Batch Correction | JUMP large, diverse batch setting 5 | Conn. | 0.34 | 7 |