FedSDR: Federated Self-Distillation with Rectification

About

Federated fine-tuning of Large Language Models faces severe statistical heterogeneity. However, existing model-level defenses often overlook the root cause: intrinsic data distribution mismatches. In this work, we first establish Federated Self-Distillation (FedSD) as a fundamental and potent strategy. By projecting client representations into a smoothed "model-understanding space," FedSD alone serves as a universal booster, outperforming conventional algorithms. Despite its success, we identify a subtle trade-off termed the Rewrite Paradox: unconstrained self-distillation can inadvertently increase hallucinations and redundancy. To refine this paradigm, we further propose FedSDR (Federated Self-Distillation with Rectification), a reinforced framework that augments FedSD with a dual-stream mechanism: a local LoRA-S (Smoothing) branch that implicitly absorbs heterogeneity via distilled data, and a parallel global LoRA-R (Rectification) branch anchored to raw data to enforce factual correctness. By selectively aggregating only LoRA-R, FedSDR yields a globally aligned and faithful model. Extensive experiments verify its superior performance.
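The selective aggregation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the aggregation rule, so a size-weighted average (as in FedAvg) is assumed here, and the key tag `lora_R`, the function names, and the flat-list parameter layout are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Hypothetical naming convention (an assumption, not from the paper):
# parameter keys containing this tag belong to the rectification branch
# (LoRA-R); all other keys, e.g. LoRA-S, never leave the client.
RECTIFICATION_TAG = "lora_R"

ClientState = Dict[str, List[float]]


def aggregate_rectification_only(
    client_states: List[ClientState],
    client_sizes: List[int],
) -> ClientState:
    """Size-weighted average of LoRA-R parameters; LoRA-S is skipped.

    The weighting rule is an assumed FedAvg-style average.
    """
    if not client_states or len(client_states) != len(client_sizes):
        raise ValueError("need exactly one sample count per client state")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    shared_keys = [k for k in client_states[0] if RECTIFICATION_TAG in k]
    aggregated: ClientState = {}
    for key in shared_keys:
        averaged = [0.0] * len(client_states[0][key])
        for state, size in zip(client_states, client_sizes):
            weight = size / total
            for i, value in enumerate(state[key]):
                averaged[i] += weight * value
        aggregated[key] = averaged
    return aggregated


def apply_global_update(
    local_state: ClientState, aggregated: ClientState
) -> ClientState:
    """Overwrite only the aggregated (LoRA-R) entries; keep local LoRA-S."""
    updated = {k: list(v) for k, v in local_state.items()}
    for key, value in aggregated.items():
        updated[key] = list(value)
    return updated


if __name__ == "__main__":
    clients = [
        {"layer0.lora_R": [1.0, 2.0], "layer0.lora_S": [10.0, 10.0]},
        {"layer0.lora_R": [3.0, 4.0], "layer0.lora_S": [-5.0, 0.0]},
    ]
    global_r = aggregate_rectification_only(clients, client_sizes=[1, 3])
    print(global_r)
    print(apply_global_update(clients[0], global_r))
```

In this sketch each client keeps its own LoRA-S weights after the round, and only the LoRA-R entries are replaced by the shared average, which mirrors the local/global split described in the abstract.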

Ziheng Ren, Zhanming Shen, Hao Wang, Ning Liu, You Song · 2026

Related benchmarks

Task	Dataset	Metric	Result	
Knowledge Evaluation	MMLU	MMLU Accuracy	47.11	64
Discrete reasoning	DROP	Exact Match (EM)	37.94	25
Complex Factual Reasoning	BBH	BBH Complex Factual Reasoning Score	35.81	6
Logic-heavy Reasoning	CRASS	Score	56.22	6

