
Low-Complexity Probing via Finding Subnetworks

About

The dominant approach in probing neural networks for linguistic properties is to train a new shallow multi-layer perceptron (MLP) on top of the model's internal representations. This approach can detect properties encoded in the model, but at the cost of adding new parameters that may learn the task directly. We instead propose a subtractive pruning-based probe, where we find an existing subnetwork that performs the linguistic task of interest. Compared to an MLP, the subnetwork probe achieves both higher accuracy on pre-trained models and lower accuracy on random models, so it is both better at finding properties of interest and worse at learning on its own. Next, by varying the complexity of each probe, we show that subnetwork probing Pareto-dominates MLP probing in that it achieves higher accuracy given any budget of probe complexity. Finally, we analyze the resulting subnetworks across various tasks to locate where each task is encoded, and we find that lower-level tasks are captured in lower layers, reproducing similar findings in past work.
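The core idea, learning a mask over frozen weights rather than training new probe parameters, can be illustrated with a minimal sketch. This is a toy stand-in, not the paper's implementation: the "model" is a single frozen linear weight vector, the mask is a plain sigmoid relaxation with an L1-style sparsity penalty (the paper uses a hard-concrete / L0 approach over the weights of a full transformer), and the function name `train_mask_probe` is invented for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mask_probe(X, y, w_frozen, steps=500, lr=0.5, lam=0.01, seed=0):
    """Learn mask logits over a frozen weight vector so that the masked
    model sigmoid(X @ (w_frozen * mask)) fits the probing labels y.
    Only the mask is trained; the "pretrained" weights never change.
    `lam` is a simple sparsity penalty standing in for the L0 /
    hard-concrete regularization used in subnetwork pruning."""
    rng = np.random.default_rng(seed)
    logits = rng.normal(0.0, 0.01, size=w_frozen.shape)  # mask logits
    n = len(y)
    for _ in range(steps):
        m = sigmoid(logits)            # soft mask in (0, 1)
        w_eff = w_frozen * m           # masked ("pruned") weights
        p = sigmoid(X @ w_eff)         # probe predictions
        grad_w = X.T @ (p - y) / n     # dBCE / dw_eff
        # Chain rule through w_eff = w_frozen * sigmoid(logits),
        # plus the gradient of the sparsity penalty lam * sum(m).
        grad_logits = (grad_w * w_frozen + lam) * m * (1.0 - m)
        logits -= lr * grad_logits
    return sigmoid(logits)
```

On toy data where only a subset of features carries the label, the learned mask keeps the relevant entries of `w_frozen` and suppresses the rest, which is the sense in which the probe finds an *existing* subnetwork instead of learning the task in new parameters.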

Steven Cao, Victor Sanh, Alexander M. Rush • 2021

Related benchmarks

Task               Dataset             Metric               Result    Rank
Circuit Discovery  IOI                 AUC                  60.5      12
Circuit Discovery  Greater-than        AUC                  0.639     12
Circuit Discovery  Docstring           AUC                  0.482     12
Circuit Discovery  InterpBench (test)  p-value (WMW)        3.10e-5   10
Circuit Discovery  InterpBench         Vargha-Delaney A12   0.887     10
Circuit Discovery  Tracr Proportion    Loss                 0.525     6
Circuit Discovery  Tracr-Reverse       Loss                 0.193     6
Circuit Discovery  Docstring           KL Divergence        0.928     6
Circuit Discovery  Greater-than        KL Divergence        0.806     6
Circuit Discovery  IOI                 KL Divergence        0.823     6

(Showing 10 of 11 rows)
