Learnable Topological Features for Phylogenetic Inference via Graph Neural Networks
About
Structural information of phylogenetic tree topologies plays an important role in phylogenetic inference. However, finding appropriate topological structures for specific phylogenetic inference tasks often requires significant design effort and domain expertise. In this paper, we propose a novel structural representation method for phylogenetic inference based on learnable topological features. By combining the raw node features that minimize the Dirichlet energy with modern graph representation learning techniques, our learnable topological features can provide efficient structural information of phylogenetic trees that automatically adapts to different downstream tasks without requiring domain expertise. We demonstrate the effectiveness and efficiency of our method on a simulated data tree probability estimation task and a benchmark of challenging real data variational Bayesian phylogenetic inference problems.
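To make "raw node features that minimize the Dirichlet energy" concrete, the sketch below computes such features for a toy four-leaf tree. It rests on assumptions the abstract does not state: leaf nodes carry fixed one-hot vectors, the energy is the sum of squared feature differences across edges, and the interior features are found with a dense linear solve. The function names `harmonic_node_features` and `dirichlet_energy` are hypothetical, and this is not claimed to be the authors' algorithm.

```python
import numpy as np


def harmonic_node_features(num_nodes, edges, leaf_features):
    """Fill in interior-node features that minimize the Dirichlet energy.

    Assumption (not from the abstract): leaf features are held fixed and
    only the interior features are free variables.
    """
    # Graph Laplacian L = D - A of the tree.
    lap = np.zeros((num_nodes, num_nodes))
    for u, v in edges:
        lap[u, u] += 1.0
        lap[v, v] += 1.0
        lap[u, v] -= 1.0
        lap[v, u] -= 1.0

    leaves = sorted(leaf_features)
    interior = [i for i in range(num_nodes) if i not in leaf_features]
    x_leaf = np.array([leaf_features[i] for i in leaves], dtype=float)

    # Zero gradient with respect to the interior features gives the
    # linear system L_II X_I = -L_IB X_B (the discrete harmonic condition).
    l_ii = lap[np.ix_(interior, interior)]
    l_ib = lap[np.ix_(interior, leaves)]
    x_int = np.linalg.solve(l_ii, -l_ib @ x_leaf)

    features = np.zeros((num_nodes, x_leaf.shape[1]))
    features[leaves] = x_leaf
    features[interior] = x_int
    return features


def dirichlet_energy(features, edges):
    """Sum of squared feature differences over all edges."""
    return sum(float(np.sum((features[u] - features[v]) ** 2)) for u, v in edges)


# Toy unrooted tree: leaves 0-3, interior nodes 4 and 5.
edges = [(0, 4), (1, 4), (2, 5), (3, 5), (4, 5)]
leaf_features = {i: np.eye(4)[i] for i in range(4)}  # assumed one-hot leaves
features = harmonic_node_features(6, edges, leaf_features)
energy = dirichlet_energy(features, edges)
```

At the minimizer each interior feature is the average of its neighbors' features, so an interior node's vector reflects how close it sits to each leaf. Vectors of this kind could then serve as input node features for a graph neural network. The dense solve is chosen for readability only and scales poorly with tree size.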
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Marginal log-likelihood estimation | DS1 (27 taxa, 1949 sites) | MLL | -7.11e+3 | 30 |
| Marginal log-likelihood estimation | DS3 (36 taxa, 1812 sites) | MLL | -3.37e+4 | 30 |
| Marginal log-likelihood estimation | DS4 (41 taxa, 1137 sites) | MLL | -1.33e+4 | 30 |
| Marginal log-likelihood estimation | DS2 (29 taxa, 2520 sites) | MLL | -2.64e+4 | 30 |
| Marginal log-likelihood estimation | DS6 (50 taxa, 1133 sites) | MLL | -6.72e+3 | 30 |
| Marginal log-likelihood estimation | DS5 (50 taxa, 378 sites) | MLL | -8.21e+3 | 30 |
| Marginal log-likelihood estimation | DS8 (64 taxa, 1008 sites) | MLL | -8.65e+3 | 29 |
| Marginal log-likelihood estimation | DS7 (59 taxa, 1824 sites) | MLL | -3.73e+4 | 27 |
| Marginal log-likelihood estimation | DS1 (test) | MLL | -7.11e+3 | 11 |
| Marginal log-likelihood estimation | DS3 | MLL | -3.37e+4 | 11 |