DynaPrompt: Dynamic Test-Time Prompt Tuning

About

Test-time prompt tuning enhances zero-shot generalization of vision-language models but tends to ignore the relatedness among test samples during inference. Online test-time prompt tuning provides a simple way to leverage the information in previous test samples, albeit with the risk of prompt collapse due to error accumulation. To enhance test-time prompt tuning, we propose DynaPrompt, short for dynamic test-time prompt tuning, exploiting relevant data distribution information while reducing error accumulation. Built on an online prompt buffer, DynaPrompt adaptively selects and optimizes the relevant prompts for each test sample during tuning. Specifically, we introduce a dynamic prompt selection strategy based on two metrics: prediction entropy and probability difference. For unseen test data information, we develop dynamic prompt appending, which allows the buffer to append new prompts and delete the inactive ones. By doing so, the prompts are optimized to exploit beneficial information on specific test data, while alleviating error accumulation. Experiments on fourteen datasets demonstrate the effectiveness of dynamic test-time prompt tuning.
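The selection and appending steps described above can be illustrated with a small sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: a buffered prompt is kept when its prediction entropy is lower, and its probability difference higher, than those of a reference (initial) prompt; probability difference is taken as the drop in top-class probability between an original and an augmented view; and the buffer evicts its least recently used prompt when full. All function names (`entropy`, `prob_difference`, `select_prompts`, `step`) are hypothetical, and prompts are stood in for by strings rather than learnable token embeddings.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def prob_difference(probs_orig, probs_aug):
    """Assumed definition: drop in the predicted class's probability
    between the original view and an augmented view of the sample."""
    top = max(range(len(probs_orig)), key=probs_orig.__getitem__)
    return probs_orig[top] - probs_aug[top]


def select_prompts(scores, ref_entropy, ref_diff):
    """Return indices of buffered prompts that pass both tests.

    scores is a list of (entropy, probability difference) pairs, one per
    buffered prompt. The thresholding rule itself is an assumption.
    """
    return [
        i for i, (h, d) in enumerate(scores)
        if h < ref_entropy and d > ref_diff
    ]


def step(buffer, scores, ref_entropy, ref_diff, new_prompt, max_size):
    """One buffer update for a single test sample.

    The buffer is ordered from least to most recently used. If some prompts
    are selected, they become the active set and move to the recent end.
    Otherwise a new prompt is appended, evicting the oldest one when the
    buffer is full (an assumed notion of "inactive").
    """
    selected = select_prompts(scores, ref_entropy, ref_diff)
    if selected:
        active = [buffer[i] for i in selected]
        rest = [p for i, p in enumerate(buffer) if i not in selected]
        return rest + active, active
    if len(buffer) >= max_size:
        buffer = buffer[1:]
    return buffer + [new_prompt], [new_prompt]


if __name__ == "__main__":
    buf = ["p0", "p1", "p2"]
    scores = [(0.2, 0.3), (0.9, 0.5), (0.1, 0.05)]
    buf, active = step(buf, scores, ref_entropy=0.5, ref_diff=0.1,
                       new_prompt="p_new", max_size=3)
    print(buf, active)
```

In a full pipeline only the prompts in the active set would be optimized on the current test sample, which is how the sketch reflects the idea of limiting error accumulation to relevant prompts.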

Zehao Xiao, Shilin Yan, Jack Hong, Jiayin Cai, Xiaolong Jiang, Yao Hu, Jiayi Shen, Qi Wang, Cees G. M. Snoek • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Classification | DTD | Accuracy | 48.75 | 485 |
| Fine-grained visual classification | FGVC-Aircraft (test) | Top-1 Acc | 24.33 | 312 |
| Image Classification | Cross-domain Benchmark (AIR, CAL, CAR, DTD, EUR, FLWR, FOOD, PETS, SUN, UCF) (test) | AIR Accuracy | 24.3 | 80 |
| Image Classification | ImageNet A, V, R, S (val) | ImageNet Accuracy | 69.61 | 38 |
| Image Classification | ImageNet OOD Variants | Accuracy (IN-A) | 56.2 | 25 |
| Image Classification | ImageNet and OOD variants 1.0 (test) | ImageNet-A Accuracy | 60.72 | 18 |
| Few-shot classification | TV100 60 classes (test) | Accuracy | 0.026 | 11 |
| Few-shot classification | Game Characters 65 classes (test) | Accuracy | 30.4 | 11 |
| Few-shot classification | Landmarks 35 classes (test) | Accuracy | 45.7 | 11 |
| Few-shot classification | Indian Food 80 classes (test) | Accuracy | 41.6 | 11 |
Showing 10 of 14 rows
