Sourcerer: Sample-based Maximum Entropy Source Distribution Estimation
About
Scientific modeling applications often require estimating a distribution of parameters consistent with a dataset of observations - an inference task also known as source distribution estimation. This problem can be ill-posed, however, since many different source distributions might produce the same distribution of data-consistent simulations. To make a principled choice among many equally valid sources, we propose an approach which targets the maximum entropy distribution, i.e., prioritizes retaining as much uncertainty as possible. Our method is purely sample-based - leveraging the Sliced-Wasserstein distance to measure the discrepancy between the dataset and simulations - and thus suitable for simulators with intractable likelihoods. We benchmark our method on several tasks, and show that it can recover source distributions with substantially higher entropy than recent source estimation methods, without sacrificing the fidelity of the simulations. Finally, to demonstrate the utility of our approach, we infer source distributions for parameters of the Hodgkin-Huxley model from experimental datasets with thousands of single-neuron measurements. In summary, we propose a principled method for inferring source distributions of scientific simulator parameters while retaining as much uncertainty as possible.
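As a rough illustration of the sample-based discrepancy the abstract refers to, the following is a minimal NumPy sketch of a Monte Carlo estimate of the Sliced-Wasserstein distance between two sample sets. It is not the authors' implementation: the function name `sliced_wasserstein`, its parameters (`n_projections`, `p`, `seed`), and the simplifying assumption that both sample sets contain the same number of points are all illustrative choices.

```python
import numpy as np


def sliced_wasserstein(x, y, n_projections=100, p=2, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-p distance.

    Hypothetical helper for illustration only. Assumes x and y are
    arrays of shape (n_samples, dim) with the SAME number of samples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes x and y have identical shapes")

    rng = np.random.default_rng(seed)
    dim = x.shape[1]

    # Draw random directions uniformly on the unit sphere.
    directions = rng.normal(size=(n_projections, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    # Project both sample sets onto every direction: (n_samples, n_projections).
    x_proj = x @ directions.T
    y_proj = y @ directions.T

    # In 1D, the optimal transport plan between equal-size samples
    # matches sorted values, so sort each projection independently.
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)

    # Average the p-th power cost over samples and projections.
    return np.mean(np.abs(x_sorted - y_sorted) ** p) ** (1.0 / p)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    observations = rng.normal(size=(500, 2))
    simulations = observations + 3.0  # a shifted copy of the observations
    print(sliced_wasserstein(observations, observations))
    print(sliced_wasserstein(observations, simulations))
```

Because the estimate needs only samples from the simulator and the dataset, no likelihood evaluation is involved, which is the property the abstract relies on for simulators with intractable likelihoods.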
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Particle physics data unfolding | Leptoquark pT split | 1D Wasserstein Distance | 7.96 | 11 |
| Particle physics data unfolding | Leptoquark px | 1D Wasserstein Distance | 5.89 | 11 |
| Particle physics data unfolding | ttbar CT14lo, Vincia (pT split) | 1D Wasserstein Distance | 15.14 | 11 |
| Particle physics data unfolding | Z+jets CTEQ6L1 (pT split) | 1D Wasserstein Distance | 19.59 | 11 |
| Particle physics data unfolding | Leptoquark (E) | 1D Wasserstein Distance | 57.42 | 11 |
| Particle physics data unfolding | Z+jets CTEQ6L1 (E) | 1D Wasserstein Distance | 79.37 | 11 |
| Particle physics data unfolding | ttbar CT14lo Vincia (px) | 1D Wasserstein Distance | 11.22 | 11 |
| Particle physics data unfolding | W+jets CT14lo (px) | 1D Wasserstein Distance | 11.04 | 11 |
| Particle physics data unfolding | Z+jets CTEQ6L1 (px) | 1D Wasserstein Distance | 10.74 | 11 |
| Particle physics data unfolding | W+jets pT split CT14lo | 1D Wasserstein Distance | 25.96 | 11 |