Conditional anomaly detection with soft harmonic functions
About
In this paper, we consider the problem of conditional anomaly detection, which aims to identify data instances with an unusual response or class label. We develop a new non-parametric approach for conditional anomaly detection based on the soft harmonic solution, which we use to estimate the confidence of each label and thereby detect anomalous mislabeling. We further regularize the solution to avoid detecting isolated examples and examples on the boundary of the distribution support. We demonstrate the efficacy of the proposed method in detecting unusual labels on several synthetic and UCI ML datasets, comparing it against several baseline approaches. We also evaluate the performance of our method on a real-world electronic health record dataset, where we seek to identify unusual patient-management decisions.
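To make the idea concrete, here is a minimal sketch of label-confidence scoring with a soft harmonic solution on a similarity graph. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian kernel, the uniform fitting weight `c`, the regularizer `gamma_g`, the score `-y * soft_label`, and the function name `soft_harmonic_scores` are all choices made here for the example.

```python
import numpy as np

def soft_harmonic_scores(X, y, sigma=1.0, gamma_g=0.1, c=1.0):
    """Return (soft_labels, anomaly_scores) for labels y in {-1, +1}.

    Assumed objective (hypothetical parameterization):
        minimize  c * ||l - y||^2 + l^T (L + gamma_g * I) l
    where L is the unnormalized Laplacian of a Gaussian similarity graph.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # Dense Gaussian similarity graph with no self-loops.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Regularized graph Laplacian; gamma_g shrinks soft labels toward zero,
    # most strongly for weakly connected (isolated) points.
    K = np.diag(W.sum(axis=1)) - W + gamma_g * np.eye(n)

    # Setting the gradient to zero gives the linear system (c*I + K) l = c*y.
    soft = np.linalg.solve(c * np.eye(n) + K, c * y)

    # Score is large when the soft label confidently disagrees with y.
    scores = -y * soft
    return soft, scores

# Toy usage: two tight clusters, with the last point of the first cluster
# carrying a label that disagrees with its neighbours.
cluster_a = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05]])
cluster_b = cluster_a + 5.0
X_demo = np.vstack([cluster_a, cluster_b])
y_demo = np.array([1, 1, 1, 1, -1, -1, -1, -1, -1, -1])
soft_demo, scores_demo = soft_harmonic_scores(X_demo, y_demo)
print(int(np.argmax(scores_demo)))
```

In this sketch the `gamma_g` term stands in for the regularization mentioned in the abstract: a point with few strong neighbours receives a soft label close to zero, so it cannot obtain a large anomaly score merely by being isolated.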
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Conditional Anomaly Detection | Housing (UCI ML) | Mean Anomaly Agreement Score (AUC) | 71.3 | 15 |
| Conditional Anomaly Detection | Auto MPG UCI ML (2/3, 1/3) train-test split | Mean Anomaly Agreement Score (AUC) | 72.6 | 10 |
| Conditional Anomaly Detection | Wine Quality (UCI ML) 2/3, 1/3 (train-test) | Mean Anomaly Agreement Score (AUC) | 74.5 | 10 |
| Conditional Anomaly Detection | Synthetic Dataset D1 | Mean Anomaly Agreement Score | 81.3 | 5 |
| Conditional Anomaly Detection | Synthetic Dataset D2 | Mean Anomaly Agreement Score | 82.4 | 5 |
| Conditional Anomaly Detection | Synthetic Dataset D3 | Mean Anomaly Agreement Score | 63 | 5 |