
FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment

About

Aligning large language models (LLMs) with human preferences in federated learning (FL) is challenging due to decentralized, privacy-sensitive, and highly non-IID preference data. Direct Preference Optimization (DPO) offers an efficient alternative to reinforcement learning from human feedback (RLHF), but its direct application in FL suffers severe performance degradation under non-IID data and limited generalization of implicit rewards. To bridge this gap, we propose FedPDPO (Federated Personalized Direct Preference Optimization), a personalized federated framework for preference alignment of LLMs. It adopts a parameter-efficient fine-tuning architecture in which each client maintains a frozen pretrained LLM backbone augmented with a Low-Rank Adaptation (LoRA) adapter, enabling communication-efficient aggregation. To address non-IID heterogeneity, we devise (1) a globally shared LoRA adapter paired with a personalized, client-specific LLM head. Moreover, we introduce (2) a personalized DPO training strategy with a client-specific explicit reward head to complement implicit rewards and further alleviate non-IID heterogeneity, and (3) a bottleneck adapter to balance global and local features. We provide theoretical analysis establishing the probabilistic foundation and soundness of the framework. Extensive experiments on multiple preference datasets demonstrate state-of-the-art performance, with up to 4.80% average accuracy improvement in federated intra-domain and cross-domain settings.
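For context on the objective the abstract builds on: standard DPO minimizes a logistic loss over the implicit reward margin between a chosen and a rejected response, measured as policy-vs-reference log-probability ratios. The sketch below shows that base loss in plain Python; it is an illustration of vanilla DPO only, not the paper's personalized variant (the explicit reward head and per-client terms are omitted), and the function and argument names are our own.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Vanilla DPO loss for one preference pair (illustrative sketch).

    logp_*      : policy log-probabilities of the chosen/rejected responses
    ref_logp_*  : reference-model log-probabilities of the same responses
    beta        : temperature scaling the implicit reward margin
    """
    # Implicit reward of each response is beta * (log pi_theta - log pi_ref);
    # the loss pushes the chosen response's reward above the rejected one's.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(sigmoid(margin))
```

With identical policy and reference log-probabilities the margin is zero and the loss equals log 2; increasing the chosen response's log-probability relative to the reference drives the loss down.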

Kewen Zhu, Liping Yi, Zhiming Zhao, Zhuang Qi, Han Yu, Qinghua Hu • 2026
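The abstract's split between a globally shared LoRA adapter and personalized client-specific heads implies a FedAvg-style aggregation that averages only the shared adapter parameters while each head stays local. A minimal sketch of that aggregation step, assuming hypothetical parameter names (keys prefixed `head.` mark personalized parameters) and scalar parameters for brevity:

```python
def aggregate_lora(client_updates, weights=None):
    """FedAvg over shared LoRA parameters only (illustrative sketch).

    client_updates : list of dicts mapping parameter name -> value;
                     keys starting with "head." are personalized and
                     are excluded from aggregation (naming is assumed,
                     not taken from the paper).
    weights        : optional per-client aggregation weights (default: uniform).
    """
    n = len(client_updates)
    if weights is None:
        weights = [1.0 / n] * n
    # Aggregate only the shared adapter parameters.
    shared_keys = [k for k in client_updates[0] if not k.startswith("head.")]
    return {
        k: sum(w * update[k] for w, update in zip(weights, client_updates))
        for k in shared_keys
    }
```

The returned dict contains only the averaged shared-adapter parameters; each client keeps its own head untouched, which is how the framework preserves personalization under non-IID data.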

Related benchmarks

Task                            Dataset                                     Accuracy  Rank
Direct Preference Optimization  IMDB (3 clients)                            87.41     11
Direct Preference Optimization  IMDB (10 clients)                           85.23     11
Direct Preference Optimization  Code-Vulnerability-Security (3 clients)     96.92     11
Direct Preference Optimization  Code-Vulnerability-Security (11 clients)    92.8      11
Direct Preference Optimization  WebGPT                                      58.92     11
Direct Preference Optimization  PyDPO                                       91.47     11
Direct Preference Optimization  UltraFeedback                               69.92     11
Preference Alignment            WebGPT (test)                               61.24     11
Preference Alignment            PyDPO (test)                                94.32     11
Preference Alignment            UltraFeedback (test)                        74.18     11
