SuperMAN: Interpretable and Expressive Networks over Temporally Sparse Heterogeneous Data
About
Real-world temporal data often consists of multiple signal types recorded at irregular, asynchronous intervals. For instance, in the medical domain, different types of blood tests can be measured at different times and frequencies, resulting in fragmented and unevenly scattered temporal data. Similar issues of irregular sampling occur in other domains, such as the monitoring of large systems using event log files. Effectively learning from such data requires handling sets of temporally sparse and heterogeneous signals. In this work, we propose Super Mixing Additive Networks (SuperMAN), a novel and interpretable-by-design framework for learning directly from such heterogeneous signals by modeling them as sets of implicit graphs. SuperMAN provides diverse interpretability capabilities, including node-level, graph-level, and subset-level importance, and enables practitioners to trade finer-grained interpretability for greater expressivity when domain priors are available. SuperMAN achieves state-of-the-art performance in real-world high-stakes tasks, including predicting Crohn's disease onset and hospital length of stay from routine blood test measurements, and detecting fake news. Furthermore, we demonstrate how SuperMAN's interpretability properties assist in revealing disease development phase transitions and provide crucial insights in the healthcare domain.
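To make the idea of modeling irregular signals as "sets of implicit graphs" more concrete, the following is a minimal sketch of one possible data representation, not the authors' implementation. It assumes each signal type becomes its own path graph whose nodes are time-ordered measurements and whose edges carry the time gap between consecutive measurements. The function name `build_signal_graphs`, the record layout, and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_signal_graphs(records):
    """Group (signal, time, value) records into one path graph per signal type.

    Hypothetical representation: each graph holds time-ordered nodes and
    edges between consecutive measurements, weighted by the time gap.
    """
    by_signal = defaultdict(list)
    for signal, time, value in records:
        by_signal[signal].append((time, value))

    graphs = {}
    for signal, points in by_signal.items():
        points.sort()  # order measurements by time (then by value on ties)
        edges = [
            (i, i + 1, points[i + 1][0] - points[i][0])  # (source, target, time gap)
            for i in range(len(points) - 1)
        ]
        graphs[signal] = {"nodes": points, "edges": edges}
    return graphs


# Made-up, asynchronously sampled blood-test records: (signal, day, value).
records = [
    ("hemoglobin", 0, 13.1),
    ("crp", 3, 4.0),
    ("hemoglobin", 30, 12.4),
    ("crp", 45, 11.2),
    ("hemoglobin", 41, 12.0),
]
graphs = build_signal_graphs(records)
for name, graph in graphs.items():
    print(name, graph["nodes"], graph["edges"])
```

Under this assumed layout, no resampling or imputation onto a common time grid is needed: each signal keeps its own sampling pattern, and a sample is simply the set of its per-signal graphs.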
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Crohn's Disease onset prediction | CD 10% random (test) | Mean AUPRC | 83.93 | 9 |
| Length of Stay in ICU prediction | P12 (test) | Mean AUPRC | 97.41 | 9 |
| Clinical Diagnosis | CD | ECE | 0.028 | 7 |