Tokenised Flow Matching for Hierarchical Simulation Based Inference
About
The cost of simulator evaluations is a key practical bottleneck for simulation-based inference (SBI). In hierarchical settings with shared global parameters and exchangeable site-level parameters and observations, this structure can be exploited to improve simulation efficiency. Existing hierarchical SBI approaches factorise the posterior yet still simulate across multiple sites per training sample; we instead explore likelihood factorisation (LF) to train from single-site simulations. In LF sampling, we learn a per-site neural surrogate of the simulator and then assemble synthetic multi-site observations to amortise inference for the full hierarchical posterior. Building on this, we propose Tokenised Flow Matching for Posterior Estimation (TFMPE), a tokenised flow matching approach that supports function-valued observations through likelihood factorisation. To enable systematic evaluation, we introduce a benchmark for hierarchical SBI. We validate TFMPE on this benchmark and on realistic infectious disease and computational fluid dynamics models, finding well-calibrated posteriors while reducing computational cost.
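The LF sampling idea described above can be illustrated with a toy sketch. This is not the paper's implementation: the simulator `simulate_site`, the standard-normal priors, and the linear-Gaussian least-squares fit (standing in for the per-site neural surrogate) are all assumptions chosen only to keep the example small and runnable. It shows the three steps: simulate single sites, fit a per-site surrogate, then assemble synthetic multi-site observations from the surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_site(global_param, local_param, rng):
    """Hypothetical single-site simulator: y = global + local + noise."""
    return global_param + local_param + 0.1 * rng.standard_normal()

# Step 1: draw training data from single-site simulations only.
# Standard-normal priors on both parameters are an assumption of this sketch.
n_train = 2000
g_train = rng.standard_normal(n_train)
l_train = rng.standard_normal(n_train)
y_train = np.array(
    [simulate_site(g, l, rng) for g, l in zip(g_train, l_train)]
)

# Step 2: fit a per-site surrogate of the simulator.
# A linear-Gaussian least-squares fit stands in for the neural surrogate.
X = np.column_stack([g_train, l_train, np.ones(n_train)])
coef, *_ = np.linalg.lstsq(X, y_train, rcond=None)
resid_std = np.std(y_train - X @ coef)

def surrogate_sample(global_param, local_params, rng):
    """Sample one observation per site from the fitted surrogate."""
    mean = coef[0] * global_param + coef[1] * local_params + coef[2]
    return mean + resid_std * rng.standard_normal(local_params.shape)

# Step 3: assemble a synthetic multi-site observation set by sharing one
# global parameter across sites and sampling each site independently.
def assemble_multisite(n_sites, rng):
    global_param = rng.standard_normal()
    local_params = rng.standard_normal(n_sites)
    obs = surrogate_sample(global_param, local_params, rng)
    return global_param, local_params, obs

global_param, local_params, obs = assemble_multisite(5, rng)
```

Tuples such as `(global_param, local_params, obs)` would then serve as training samples for an amortised posterior estimator over the full hierarchy, without running the original simulator on multiple sites per sample.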
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Simulation-Based Inference | Hierarchical Gaussian Linear Uniform | l-C2ST | 3.65e-4 | 55 |
| Simulation-Based Inference | Hierarchical Two Moons | l-C2ST | 0.241 | 55 |
| Simulation-Based Inference | Hierarchical Gaussian Linear | l-C2ST | 3.43e-4 | 55 |
| Simulation-Based Inference | Hierarchical Gaussian Mixture | l-C2ST | 5.92e-4 | 55 |
| Simulation-Based Inference | Hierarchical SLCP | l-C2ST | 21.6 | 54 |
| Simulation-Based Inference | Hierarchical SIR | l-C2ST | 7.33e-5 | 53 |