Diffusion-Inspired Reconfiguration of Transformers for Uncertainty Calibration
About
Uncertainty calibration in pre-trained transformers is critical for their reliable deployment in risk-sensitive applications. Yet most existing pre-trained transformers lack a principled mechanism for propagating uncertainty through their feature-transformation stack. In this work, we propose a diffusion-inspired reconfiguration of transformers in which each feature-transformation block is modeled as a probabilistic mapping. Composing these probabilistic mappings yields a probability path that mirrors the structure of a diffusion process, transporting probability mass from the input distribution to the pre-trained feature distribution. This probability path can then be recompiled into a diffusion process with a unified transition model, enabling principled propagation of representation uncertainty throughout the pre-trained model's architecture while preserving its original predictive performance. Empirical results across a variety of vision and language benchmarks demonstrate that our method achieves superior calibration and predictive accuracy compared to existing uncertainty-aware transformers.
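The core mechanism described above, treating each block as a probabilistic (Gaussian) transition and composing the transitions to move uncertainty through the stack, can be illustrated with a toy numpy sketch. Everything here is an illustrative assumption rather than the paper's actual implementation: the blocks are simple `tanh` layers, each transition adds a fixed noise variance `sigma2`, and the variance is pushed through the nonlinearity with a first-order (delta-method) approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

def block_forward(x, W):
    # Toy stand-in for a pre-trained feature-transformation block.
    return np.tanh(x @ W)

def gaussian_transition(mean, var, W, sigma2=0.05):
    # Model the block as a probabilistic mapping
    #   q(x_l | x_{l-1}) = N(f_l(x_{l-1}), sigma_l^2 I),
    # analogous to one diffusion transition kernel.
    new_mean = block_forward(mean, W)
    # First-order variance propagation through tanh:
    #   Var[tanh(u)] ≈ (1 - tanh(u)^2)^2 · Var[u],  Var[u] ≈ var @ W^2
    jac_sq = (1.0 - new_mean ** 2) ** 2
    new_var = jac_sq * (var @ (W ** 2)) + sigma2
    return new_mean, new_var

d = 8
mean = rng.normal(size=(1, d))        # input point (zero initial variance)
var = np.zeros((1, d))
for _ in range(4):                    # compose four stacked block transitions
    W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, d))
    mean, var = gaussian_transition(mean, var, W)

print(mean.shape, var.shape)          # per-dimension mean and variance
print(bool(np.all(var > 0)))          # uncertainty has been accumulated
```

Composing the four `gaussian_transition` calls is the "probability path": the mean follows the original deterministic forward pass, while the variance accumulates layer by layer, which is what allows calibrated uncertainty at the output without retraining the backbone.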
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Out-of-Distribution Detection | CIFAR-100 | AUROC 79.43 | 107 |
| Out-of-Distribution Detection | SVHN | AUROC 92.14 | 62 |
| Uncertainty Calibration | CIFAR-10-C | -- | 35 |
| Out-of-Distribution Detection | LSUN | AUROC 91.39 | 26 |
| Classification | CIFAR-10 | Accuracy 89.73 | 8 |
| Classification | IMDB | Accuracy 87.88 | 8 |
| Image Classification | CIFAR-10 (test) | Accuracy 87.06 | 8 |
| Text Classification | IMDB (test) | Accuracy 87.13 | 8 |
| Text Classification | CoLA (test) | MCC 32.05 | 8 |
| Image Classification | CIFAR-10 (test) | Accuracy 87.06 | 6 |