Reparameterizable Subset Sampling via Continuous Relaxations
About
Many machine learning tasks require sampling a subset of items from a collection based on a parameterized distribution. The Gumbel-softmax trick can be used to sample a single item, and allows for low-variance reparameterized gradients with respect to the parameters of the underlying distribution. However, stochastic optimization involving subset sampling is typically not reparameterizable. To overcome this limitation, we define a continuous relaxation of subset sampling that provides reparameterization gradients by generalizing the Gumbel-max trick. We use this approach to sample subsets of features in an instance-wise feature selection task for model interpretability, subsets of neighbors to implement a deep stochastic k-nearest neighbors model, and sub-sequences of neighbors to implement parametric t-SNE by directly comparing the identities of local neighbors. We improve performance in all these tasks by incorporating subset sampling in end-to-end training.
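The relaxation described above can be sketched as a "Gumbel top-k" procedure: perturb the log-weights with Gumbel noise, then apply a softmax k times, down-weighting already-selected items after each step so the relaxed output sums to k and approaches a k-hot indicator as the temperature goes to 0. The sketch below is an illustrative NumPy implementation under these assumptions, not the authors' reference code; the function name `gumbel_topk_relaxed` and its parameters are our own.

```python
import numpy as np

def gumbel_topk_relaxed(log_w, k, temperature=0.5, rng=None):
    """Continuous relaxation of sampling a k-subset.

    A sketch of a Gumbel top-k relaxation: log_w are unnormalized
    log-weights of the n items. Returns a vector in [0, 1]^n that
    sums to k and becomes (approximately) k-hot as temperature -> 0.
    Names and exact masking scheme here are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel-max trick: perturb log-weights with i.i.d. Gumbel(0, 1) noise.
    r = log_w + rng.gumbel(size=log_w.shape)
    out = np.zeros_like(r)
    for _ in range(k):
        # Soft selection of the current argmax (numerically stable softmax).
        a = np.exp(r / temperature - np.logaddexp.reduce(r / temperature))
        out += a
        # Down-weight already-selected items so the next softmax
        # concentrates on the remaining ones.
        r = r + np.log1p(-np.clip(a, None, 1.0 - 1e-6))
    return out

# Relaxed 2-subset of 5 equally weighted items.
mask = gumbel_topk_relaxed(np.log(np.ones(5) / 5), k=2, temperature=0.1)
```

Because every operation is differentiable in `log_w`, gradients of a downstream loss flow back to the distribution parameters (the reparameterization property the abstract refers to); at low temperature the mask concentrates on the k items a hard Gumbel top-k would pick.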
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Subset Selection | BeerAdvocate AROMA (test) | Test MSE | 2.52 | 15 |
| Learning to Explain | BeerAdvocate AROMA (test) | Test MSE | 2.52 | 12 |
| Learning to Explain | BeerAdvocate Appearance (test) | Test MSE | 2.48 | 3 |
| Learning to Explain | BeerAdvocate Palate (test) | Test MSE | 2.94 | 3 |
| Learning to Explain | BeerAdvocate Taste (test) | Test MSE | 2.18 | 3 |