
Robust Adversarial Quantification via Conflict-Aware Evidential Deep Learning

About

The reliability of deep learning models is critical for deployment in high-stakes applications, where out-of-distribution (OOD) or adversarial inputs may lead to detrimental outcomes. Evidential Deep Learning (EDL), an efficient paradigm for uncertainty quantification, models predictions as Dirichlet distributions obtained from a single forward pass. However, EDL is particularly vulnerable to adversarially perturbed inputs, on which it makes overconfident errors. Conflict-aware Evidential Deep Learning (C-EDL) is a lightweight post-hoc uncertainty quantification approach that mitigates these issues, enhancing adversarial and OOD robustness without retraining. C-EDL generates diverse, task-preserving transformations of each input and quantifies representational disagreement across them to calibrate uncertainty estimates when needed. C-EDL's conflict-aware prediction adjustment improves detection of OOD and adversarial inputs while maintaining high in-distribution accuracy and low computational overhead. Our experimental evaluation shows that C-EDL significantly outperforms state-of-the-art EDL variants and competitive baselines, achieving substantial reductions in coverage for OOD data (up to ≈55%) and adversarial data (up to ≈90%) across a range of datasets, attack types, and uncertainty metrics.
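The core EDL quantities the abstract refers to can be made concrete with a small sketch. Below, Dirichlet parameters are taken as alpha = evidence + 1 and vacuity uncertainty as K / sum(alpha), which is the standard EDL formulation; the conflict-discounting rule is an illustrative placeholder for the idea of penalising disagreement across transformed views, not the paper's exact adjustment.

```python
import numpy as np

def dirichlet_uncertainty(evidence):
    """Vacuity uncertainty of an EDL prediction: u = K / sum(alpha),
    with Dirichlet parameters alpha = evidence + 1."""
    alpha = np.asarray(evidence, dtype=float) + 1.0
    return len(alpha) / alpha.sum()

def conflict_aware_evidence(evidences):
    """Illustrative conflict-aware adjustment (not the paper's exact rule):
    discount the mean evidence by the disagreement across transformed
    views of one input, so conflicting views yield higher uncertainty."""
    E = np.asarray(evidences, dtype=float)  # shape: (n_views, n_classes)
    alpha = E + 1.0
    probs = alpha / alpha.sum(axis=1, keepdims=True)
    # Disagreement: mean total-variation distance of each view's
    # predictive distribution from the mean prediction.
    mean_p = probs.mean(axis=0)
    conflict = 0.5 * np.abs(probs - mean_p).sum(axis=1).mean()  # in [0, 1)
    return E.mean(axis=0) * (1.0 - conflict)

# Agreeing views keep their evidence; conflicting views are discounted.
agree = [[9.0, 0.5, 0.5], [8.5, 0.6, 0.4]]
disagree = [[9.0, 0.5, 0.5], [0.5, 9.0, 0.5]]
u_agree = dirichlet_uncertainty(conflict_aware_evidence(agree))
u_conflict = dirichlet_uncertainty(conflict_aware_evidence(disagree))
assert u_conflict > u_agree  # conflicting views => more uncertainty
```

The example shows the mechanism only: when transformed views of an input disagree, total evidence shrinks and vacuity uncertainty rises, which is what lets a conflict-aware model flag OOD and adversarial inputs.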

Charmaine Barker, Daniel Bethell, Simos Gerasimou • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification, OOD Detection, and Adversarial Attack Detection | MNIST (ID) -> FashionMNIST (OOD) (test) | ID Accuracy (%) | 99.98 | 11 |
| Image Classification, OOD Detection, and Adversarial Attack Detection | CIFAR10 (ID) -> SVHN (OOD) (test) | ID Accuracy (%) | 98.4 | 11 |
| Image Classification, OOD Detection, and Adversarial Attack Detection | CIFAR10 (ID) -> CIFAR100 (Near-OOD) (test) | ID Accuracy (%) | 98.64 | 11 |
| Image Classification, OOD Detection, and Adversarial Attack Detection | Oxford Flowers low-shot (ID) -> Deep Weeds (OOD) (test) | ID Accuracy (%) | 100 | 11 |
| Image Classification, OOD Detection, and Adversarial Attack Detection | MNIST (ID) -> KMNIST (OOD) (test) | ID Accuracy (%) | 99.98 | 11 |
| Image Classification, OOD Detection, and Adversarial Attack Detection | MNIST (ID) -> EMNIST (Near-OOD) (test) | ID Accuracy (%) | 99.99 | 11 |
| Image Classification | MNIST | ID Accuracy (%) | 99.98 | 9 |
| Adversarial Attack Detection | MNIST L2PGD attack | Adversarial Coverage (%) | 23.39 | 9 |
| Out-of-Distribution Detection | MNIST to FashionMNIST | OOD Coverage (%) | 5.8 | 9 |
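The coverage figures in the table are selective-prediction metrics: the fraction of a batch that the model still accepts (fails to flag as uncertain) under some uncertainty threshold, so lower coverage on OOD or adversarial data means better detection. A minimal sketch, with hypothetical uncertainty values and a hypothetical threshold `tau`:

```python
import numpy as np

def coverage(uncertainties, threshold):
    """Fraction of inputs whose uncertainty falls below the threshold,
    i.e. inputs the model accepts rather than flags. On OOD or
    adversarial batches, lower coverage means better detection."""
    u = np.asarray(uncertainties, dtype=float)
    return float((u < threshold).mean())

# Hypothetical uncertainties: ID inputs are confident, most OOD are not.
id_u = [0.05, 0.10, 0.08, 0.12]
ood_u = [0.60, 0.75, 0.15, 0.90]
tau = 0.5  # threshold, e.g. chosen from ID validation quantiles
print(coverage(id_u, tau))   # 1.0: all ID inputs accepted
print(coverage(ood_u, tau))  # 0.25: 3 of 4 OOD inputs flagged
```

Under this reading, the table's 5.8% OOD coverage means roughly 94% of FashionMNIST inputs were flagged while ID accuracy stayed at 99.98%.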
