SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability
About
We propose a new technique, Singular Vector Canonical Correlation Analysis (SVCCA), a tool for quickly comparing two representations in a way that is both invariant to affine transform (allowing comparison between different layers and networks) and fast to compute (allowing more comparisons to be calculated than with previous methods). We deploy this tool to measure the intrinsic dimensionality of layers, showing in some cases needless over-parameterization; to probe learning dynamics throughout training, finding that networks converge to final representations from the bottom up; to show where class-specific information in networks is formed; and to suggest new training regimes that simultaneously save computation and overfit less. Code: https://github.com/google/svcca/
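The two steps in the method's name can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation in the linked repository. It assumes activations are stored as `(n_samples, n_neurons)` arrays, and the function names, the `var_kept` parameter, and its 0.99 default are hypothetical choices made here. An SVD first keeps the directions carrying most of each layer's variance, and canonical correlations between the two reduced subspaces are then averaged into a single score.

```python
import numpy as np


def top_singular_directions(acts, var_kept=0.99):
    """Left singular vectors of the centered activations that together
    explain at least `var_kept` of the variance (assumed threshold)."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(energy, var_kept)) + 1, len(s))
    return u[:, :k]


def svcca_similarity(acts_a, acts_b, var_kept=0.99):
    """Mean canonical correlation between the two reduced subspaces.

    Both inputs must describe the same n_samples data points; the number
    of neurons may differ between them.
    """
    ua = top_singular_directions(acts_a, var_kept)
    ub = top_singular_directions(acts_b, var_kept)
    # With orthonormal bases, the canonical correlations are the singular
    # values of ua^T ub (cosines of the principal angles).
    rho = np.linalg.svd(ua.T @ ub, compute_uv=False)
    return float(np.clip(rho, 0.0, 1.0).mean())


# Synthetic check: y is an invertible affine transform of x, z is unrelated.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 5))
a = np.eye(5) + 0.1 * rng.normal(size=(5, 5))
y = x @ a + 3.0
z = rng.normal(size=(500, 5))

sim_affine = svcca_similarity(x, y, var_kept=1.0)
sim_random = svcca_similarity(x, z)
```

With `var_kept=1.0` no directions are discarded, so `sim_affine` comes out numerically close to 1, which illustrates the affine invariance claimed in the abstract; `sim_random` is much lower. With a threshold below 1, the truncation step can drop different directions on each side, so the invariance then holds only approximately.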
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Prediction-grounded correlation with output difference (JSD) | SST-2 | Spearman Correlation | 0.66 | 145 |
| Correlation to Accuracy Difference | Cora | Correlation Coefficient | -0.02 | 117 |
| Prediction-grounded correlation with accuracy difference | ImageNet-100 | Spearman Correlation | 0.29 | 111 |
| Correlation to Model Behavior Differences | MNLI | Accuracy Correlation | 0.32 | 93 |
| Correlation to Accuracy Difference | Ogbn-arxiv | Correlation Coefficient | 0.1 | 93 |
| Correlation to Accuracy Difference | Flickr | Correlation Coefficient | 0.01 | 92 |
| Correlation to Accuracy Difference (Test 1) | ImageNet-100 1.0 (test) | JSD Correlation to Accuracy Diff | 0.25 | 80 |
| Prediction-grounded correlation with accuracy difference | SST-2 | Spearman Correlation | 0.4 | 54 |
| Cross-Lingual Knowledge Alignment | BMLAMA | Pearson Correlation | 0.88 | 48 |
| Zero-Shot Cross-Lingual Transfer | XNLI | Pearson Correlation | 0.9144 | 48 |