HDTree: Generative Modeling of Cellular Hierarchies for Robust Lineage Inference
About
In single-cell research, tracing and analyzing high-throughput single-cell differentiation trajectories is crucial for understanding biological processes. Key to this is the robust modeling of hierarchical structures that govern cellular development. Traditional methods face limitations in computational cost, performance, and stability. VAE-based approaches have made strides but still require branch-specific network modules, limiting their scalability and stability, while often suffering from posterior collapse. To overcome these challenges, we introduce HDTree, a generative modeling framework designed for robust lineage inference. HDTree captures tree relationships within a hierarchical latent space using a unified hierarchical codebook and employs a quantized diffusion process to model continuous cell state transitions. By aligning the generative process with the Waddington landscape, this method not only improves stability and scalability but also enhances the biological plausibility of inferred lineages. HDTree's effectiveness is demonstrated through comparisons on both general-purpose and single-cell datasets, where it outperforms existing methods in lineage inference accuracy, reconstruction quality, and hierarchical consistency. These contributions enable accurate and efficient modeling of cellular differentiation paths, offering reliable insights for biological discovery. Code is available at https://github.com/zangzelin/code_HDTree_icml.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Cellular Lineage Inference | Limb (cell lineage) | DP | 41 | 14 |
| Cellular Lineage Inference | LHCO cell lineage | DP | 42.7 | 12 |
| Cellular Lineage Inference | Weinreb (cell lineage) | DP | 63.3 | 12 |
| Hierarchical Generative Modeling | MNIST | DP | 92.7 | 7 |
| Hierarchical Generative Modeling | Fashion MNIST | DP | 57.4 | 7 |
| Hierarchical Generative Modeling | 20 Newsgroups text | DP | 23.7 | 7 |
| Hierarchical Generative Modeling | Cifar10 32x32 (50k samples) | DP | 44.2 | 6 |
| Cellular Lineage Inference | ECL cell lineage 838k points celltype:10 | DP | 69 | 5 |
| Lineage Inference | LineageVAE Day 2 | Ratio of Observed Time Points | 23.2 | 5 |
| Lineage Inference | LineageVAE Day 4 | Ratio of Observed Time Points | 38.4 | 5 |