| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Marginal Log-Likelihood Estimation | DS2 29 Taxa, 2520 Sites | MLL -26,366.45 | 30 |
| Smart contract vulnerability detection | DS2 | Detected Defects Count 146 | 12 |
| Marginal Log-Likelihood Estimation | DS2 (test) | MLL -26,367.57 | 11 |
| stroke-level sketch edit | DS2 QuickDraw (test) | Reconstruction 89.25 | 10 |
| Sketch Reconstruction | DS2 | Reconstruction Score 90.66 | 10 |
| Variational Inference | DS2 | ELBO (nats) -26,569.5 | 9 |
| Stationary Linear Regression | DS2 1.0 (test) | R² 0.9842 | 9 |
| Regression | DS2 | R-Squared 0.9842 | 9 |
| Classification | DS2 (test) | Accuracy 85.03 | 8 |
| Online Learning | DS2 | Avg Wall-Clock Time (s) 0.0222 | 8 |
| Marginal log-likelihood estimation | DS2 29 taxa, 2520 sites 1.0 (test) | Mean Log-Likelihood -26,363.85 | 6 |
| Marginal Log-Likelihood Estimation | DS2 | Gap (nats) -0.76 | 5 |
| Phylogenetic Marginal Log-likelihood Estimation | DS2 1 (test) | MLL Gap (nats) -0.76 | 4 |
| Phylogenetic tree topology density estimation | DS2 | KL Divergence 0.0097 | 4 |
| Multi-robot Task Scheduling | DS2 | Average Schedule Timespan 66 | 3 |
| Variational Bayesian Phylogenetic Inference | DS2 29 taxa, 2520 sites (ground truth) | ML Score -26,367.71 | 3 |