
Online Partitioned Local Depth for semi-supervised applications

About

We introduce an extension of the partitioned local depth (PaLD) algorithm that is adapted to online applications such as semi-supervised prediction. PaLD is best known for unsupervised, parameter-free clustering, but its robustness is based on triples of data points, which makes exact analysis computationally expensive. Research is ongoing to improve the scalability of the underlying discrete algorithm and to expand the breadth of PaLD's applications. The new algorithm we present, online PaLD, is well suited to situations where a cohesion network can be pre-computed from a reference dataset. After $O(n^3)$ steps to construct a queryable data structure, online PaLD can extend the cohesion network to a new data point in $O(n^2)$ time. Our approach complements previous speed-up approaches based on approximation and parallelism. In practical terms, online PaLD makes larger datasets accessible to exact analysis with a relatively simple implementation. We present applications to online anomaly detection and semi-supervised classification for health-care datasets as initial illustrations of online PaLD's potential to expand applications of the PaLD framework.
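For orientation, the $O(n^3)$ cost mentioned above comes from the fact that every pair of points must be compared against every third point. The sketch below is a minimal batch cohesion computation following the standard PaLD definition from the earlier PaLD literature, assuming Euclidean distance. It is not the online algorithm of this paper, whose queryable data structure is not described in the abstract, and the names `cohesion_matrix` and `local_depths` are hypothetical.

```python
from math import dist


def cohesion_matrix(points):
    """Batch PaLD cohesion matrix for a list of coordinate tuples.

    Illustrative sketch only: Euclidean distance is an assumption, and
    this is the plain O(n^3) batch computation, not online PaLD.
    """
    n = len(points)
    if n < 2:
        raise ValueError("PaLD needs at least two points")

    # Pairwise distance matrix.
    d = [[dist(p, q) for q in points] for p in points]
    c = [[0.0] * n for _ in range(n)]

    for x in range(n - 1):
        for y in range(x + 1, n):
            r = d[x][y]
            # Local focus: points at least as close to x or y as they are to each other.
            focus = [z for z in range(n) if d[z][x] <= r or d[z][y] <= r]
            w = 1.0 / len(focus)
            for z in focus:
                if d[z][x] < d[z][y]:
                    c[x][z] += w          # z supports x in the (x, y) conflict
                elif d[z][y] < d[z][x]:
                    c[y][z] += w          # z supports y
                else:
                    c[x][z] += w / 2      # tie: split the support
                    c[y][z] += w / 2

    # Average over the n - 1 possible opponents of each point.
    scale = 1.0 / (n - 1)
    return [[v * scale for v in row] for row in c]


def local_depths(cohesion):
    """Local depth of each point is the corresponding row sum of the cohesion matrix."""
    return [sum(row) for row in cohesion]
```

Calling `local_depths(cohesion_matrix(points))` on a small list of coordinate tuples returns one depth per point. Under this definition the depths always sum to $n/2$, which is a convenient sanity check. The triple loop over `x`, `y`, and `z` is the cubic cost that motivates pre-computing a reference structure once and reusing it for new points.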

John D. Foley, Justin T. Lee • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Anomaly Detection | WBC | ROCAUC 94.8 | 132 |
| Tabular Anomaly Detection | pima | AUC ROC 0.65 | 86 |
| Tabular Anomaly Detection | Vertebral | AUC-ROC 62.2 | 50 |
| Anomaly Detection | Cardiotocography | AUC-ROC 0.843 | 44 |
| Anomaly Detection | Lympho | AUC-ROC 94.2 | 40 |
| Outlier Detection | BreastW | AUC-PRC 93.3 | 20 |
| Anomaly Detection | Hepatitis | AUC ROC 0.638 | 19 |
| Anomaly Detection | cardio | PR 0.694 | 13 |
