FedSurrogate: Backdoor Defense in Federated Learning via Layer Criticality and Surrogate Replacement

About

Federated Learning remains highly susceptible to backdoor attacks, in which malicious clients inject targeted behaviours into the global model. Existing defenses suffer from substantial false-positive rates under realistic non-independent and identically distributed (non-IID) data, incorrectly flagging benign clients and degrading model accuracy even when adversaries are correctly identified. We present FedSurrogate, a novel backdoor defense that addresses this limitation by combining bidirectional gradient alignment filtering with layer-adaptive anomaly detection. FedSurrogate performs selective clustering on security-critical layers identified via directional divergence analysis, concentrating the detection signal on a low-dimensional subspace. A bidirectional soft-filtering stage screens trusted clients for residual contamination while rescuing false positives from suspects, substantially reducing misclassifications under heterogeneous conditions. Rather than removing confirmed malicious updates, FedSurrogate replaces them with downscaled surrogate updates from structurally similar benign clients, preserving gradient diversity while neutralising adversarial influence. Extensive evaluations demonstrate that FedSurrogate maintains false-positive rates below 10% across all datasets and attack types, compared to 31-32% for the nearest comparably effective baseline, while achieving superior main-task accuracy and keeping attack success rates below 2.1% across all tested datasets and attack types under challenging non-IID settings.
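The surrogate-replacement step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how "structurally similar" is measured or how strongly surrogates are downscaled, so the cosine-similarity donor choice, the `scale=0.5` default, and the function names `cosine`, `surrogate_replace`, and `fed_avg` are all assumptions made for illustration. Updates are treated as flat lists of floats.

```python
import math

def cosine(u, v):
    """Cosine similarity of two flat update vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def surrogate_replace(updates, flagged, scale=0.5):
    """Return a copy of `updates` in which each flagged client's update is
    replaced by a downscaled copy of a benign client's update.

    Assumptions (not taken from the paper): the donor is the benign client
    whose update is most cosine-similar to the flagged one, and `scale` is a
    fixed constant.
    """
    flagged = set(flagged)
    benign = [i for i in range(len(updates)) if i not in flagged]
    if not benign:
        raise ValueError("at least one benign client is required")
    out = [list(u) for u in updates]  # leave the caller's data untouched
    for m in sorted(flagged):
        donor = max(benign, key=lambda b: cosine(updates[m], updates[b]))
        out[m] = [scale * x for x in updates[donor]]
    return out

def fed_avg(updates):
    """Unweighted coordinate-wise mean of the client updates."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]

# Client 2 is flagged; its update is swapped for half of client 0's update.
client_updates = [[1.0, 0.0], [0.0, 1.0], [10.0, 0.1]]
cleaned = surrogate_replace(client_updates, flagged=[2], scale=0.5)
global_update = fed_avg(cleaned)
```

Because the flagged client's slot is filled rather than dropped, the number of updates entering the average stays the same, which is one way to read the abstract's point about preserving gradient diversity.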

Fatima Z. Abacha, Sin G. Teo, Yuanxiang Wu, Lucas C. Cordeiro, Mustafa A. Mustafa • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-10 non-IID alpha=0.5 | CBA MTA 87.92 | 9 |
| Image Classification | CIFAR-100 non-IID alpha=0.5 | CBA MTA 66.66 | 9 |
| Backdoor Defense | MNIST alpha=0.5 (non-IID) | CBA MTA 99.11 | 9 |
| Backdoor Defense | Fashion-MNIST non-IID alpha=0.5 | CBA MTA 88.99 | 9 |
| Federated Image Classification | CIFAR-10 iid (test) | CBA MTA 91.17 | 9 |
| Federated Image Classification | CIFAR-100 IID (test) | CBA MTA 68.92 | 9 |
| Malicious Client Detection | MNIST alpha=0.5 (non-IID) | CBA True Positive Rate (TPR) 100 | 8 |
| Malicious Client Detection | Fashion-MNIST alpha=0.5 (non-IID) | CBA TPR 100 | 8 |
| Malicious Client Detection | CIFAR-10 alpha=0.5 (Non-IID) | CBA TPR 99.8 | 8 |
| Malicious Client Detection | CIFAR-100 alpha=0.5 (non-IID) | CBA TPR 99.3 | 8 |
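
For readers unfamiliar with the detection metrics, the sketch below shows how a true-positive rate and a false-positive rate are conventionally computed for malicious-client detection. It uses the standard definitions only; how the benchmark itself aggregates these rates (per round, per run, or otherwise) is not stated here, and the function name `detection_rates` and its arguments are hypothetical.

```python
def detection_rates(flagged, malicious, num_clients):
    """Return (TPR, FPR) in percent for one round of client detection.

    flagged:     ids the defense marked as malicious
    malicious:   ids that are actually malicious (ground truth)
    num_clients: total number of participating clients, ids 0..num_clients-1
    """
    flagged, malicious = set(flagged), set(malicious)
    benign = set(range(num_clients)) - malicious
    true_pos = len(flagged & malicious)   # attackers correctly flagged
    false_pos = len(flagged & benign)     # benign clients wrongly flagged
    tpr = 100.0 * true_pos / len(malicious) if malicious else 0.0
    fpr = 100.0 * false_pos / len(benign) if benign else 0.0
    return tpr, fpr

# 10 clients, clients 0 and 1 are attackers, the defense flags 0, 1 and 5.
rates = detection_rates(flagged={0, 1, 5}, malicious={0, 1}, num_clients=10)
# rates == (100.0, 12.5): both attackers caught, 1 of 8 benign clients misflagged
```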