Prompt Estimation from Prototypes for Federated Prompt Tuning of Vision Transformers

About

Visual Prompt Tuning (VPT) of pre-trained Vision Transformers (ViTs) has proven highly effective as a parameter-efficient fine-tuning technique for adapting large models to downstream tasks with limited data. Its parameter efficiency makes it particularly suitable for Federated Learning (FL), where both communication and computation budgets are often constrained. However, global prompt tuning struggles to generalize across heterogeneous clients, while personalized tuning overfits to local data and generalizes poorly. We propose PEP-FedPT (Prompt Estimation from Prototypes for Federated Prompt Tuning), a unified framework designed to achieve both generalization and personalization in federated prompt tuning of ViTs. Within this framework, we introduce the novel Class-Contextualized Mixed Prompt (CCMP), which is built from class-specific prompts maintained alongside a globally shared prompt. For each input, CCMP adaptively combines the class-specific prompts using weights derived from global class prototypes and client class priors. This approach enables per-sample prompt personalization without storing client-dependent trainable parameters. The prompts themselves are optimized collaboratively with standard federated averaging. Comprehensive evaluations on the CIFAR-100, TinyImageNet, DomainNet, and iNaturalist datasets demonstrate that PEP-FedPT consistently surpasses state-of-the-art baselines under diverse data heterogeneity scenarios, establishing a strong foundation for efficient and generalizable federated prompt tuning of Vision Transformers.
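The abstract does not give the exact CCMP weighting rule, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the weights are a softmax over cosine similarity to each global class prototype plus the log of the client's class prior (a Bayes-style posterior), and that prompts are flattened into plain vectors. All names here (`mixing_weights`, `mixed_prompt`, `fedavg`, `temperature`) are hypothetical, and the code uses only the Python standard library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def mixing_weights(feature, prototypes, class_prior, temperature=0.1):
    """ASSUMED rule: weight_c is proportional to
    prior_c * exp(cos(feature, prototype_c) / temperature)."""
    scores = [
        cosine(feature, proto) / temperature + math.log(prior + 1e-12)
        for proto, prior in zip(prototypes, class_prior)
    ]
    return softmax(scores)


def mixed_prompt(weights, class_prompts):
    """Per-sample prompt: weighted sum of the class-specific prompts."""
    dim = len(class_prompts[0])
    return [
        sum(w * prompt[d] for w, prompt in zip(weights, class_prompts))
        for d in range(dim)
    ]


def fedavg(client_params, client_sizes):
    """Standard federated averaging: mean of client parameters,
    weighted by each client's number of local samples."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * params[d] for n, params in zip(client_sizes, client_params)) / total
        for d in range(dim)
    ]


if __name__ == "__main__":
    prototypes = [[1.0, 0.0], [0.0, 1.0]]               # global class prototypes
    class_prompts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # one flattened prompt per class
    feature = [0.9, 0.1]                                # feature of one input sample
    weights = mixing_weights(feature, prototypes, class_prior=[0.5, 0.5])
    print(weights, mixed_prompt(weights, class_prompts))
```

Under these assumptions the only client-specific quantity is the class prior, which is a statistic of the local data and not a trained parameter. The class-specific and shared prompts are the only tensors that would pass through `fedavg`.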

M Yashwanth, Sharannya Ghosh, Aditay Tripathi, Anirban Chakraborty • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | DomainNet (test) | Average Accuracy | 88.73 | 266 |
| Image Classification | DomainNet | Accuracy (ClipArt) | 95.46 | 238 |
| Image Classification | DomainNet | Accuracy | 90.32 | 95 |
| Image Classification | iNaturalist (test) | Accuracy | 54.16 | 35 |
| Image Classification | CIFAR-100 Pathological | Mean Accuracy | 95.46 | 26 |
| Image Classification | iNaturalist (Participating) | Accuracy | 62.41 | 8 |
| Image Classification | Tiny-ImageNet (Pathological) | Mean Accuracy | 91.52 | 8 |
| Image Classification | Tiny-ImageNet (Dir(0.3)) | Mean Accuracy | 83.44 | 8 |
| Image Classification | CIFAR-100 (Participating Clients) | Accuracy | 95.66 | 8 |
| Image Classification | Tiny-ImageNet Participating Clients | Accuracy | 92.53 | 8 |
Showing 10 of 12 rows
