
Active Prompt Learning in Vision Language Models

About

Pre-trained Vision Language Models (VLMs) have demonstrated notable progress in various zero-shot tasks, such as classification and retrieval. Despite this performance, adaptation remains essential because improving performance on new tasks requires task-specific knowledge. While labels are needed for the adaptation, acquiring them is typically expensive. To overcome this challenge, active learning, a method of achieving high performance by obtaining labels for a small number of samples from experts, has been studied. Active learning primarily focuses on selecting unlabeled samples for labeling and leveraging them to train models. In this study, we pose the question, "how can pre-trained VLMs be adapted under the active learning framework?" In response to this inquiry, we observe that (1) simply applying a conventional active learning framework to pre-trained VLMs may even degrade performance compared to random selection because of the class imbalance in labeling candidates, and (2) the knowledge of VLMs can provide hints for achieving the balance before labeling. Based on these observations, we devise a novel active learning framework for VLMs, denoted as PCB. To assess the effectiveness of our approach, we conduct experiments on seven different real-world datasets, and the results demonstrate that PCB surpasses conventional active learning and random sampling methods. Code will be available at https://github.com/kaist-dmlab/pcb.
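
The two observations above (uncertainty-driven selection can skew toward a few classes, and the model's own predictions can hint at class balance before labeling) can be illustrated with a minimal sketch. This is not the authors' PCB implementation; the function names `entropy` and `balanced_select`, the `pool_factor` parameter, and the greedy balancing rule are assumptions made here for illustration, and the input probabilities stand in for zero-shot VLM predictions.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def balanced_select(probs, labeled_counts, budget, pool_factor=3):
    """Pick up to `budget` unlabeled indices: uncertain first, then class-balanced.

    probs: list of per-sample class-probability lists (assumed model outputs).
    labeled_counts: number of labels already held for each class.
    pool_factor: hypothetical knob; the shortlist holds budget * pool_factor samples.
    """
    # Step 1: shortlist the most uncertain samples (conventional active learning).
    ranked = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    shortlist = ranked[: budget * pool_factor]

    # Step 2: pseudo-label each shortlisted sample with the model's top class.
    pseudo = {i: max(range(len(probs[i])), key=lambda c: probs[i][c]) for i in shortlist}

    # Step 3: greedily take the sample whose pseudo-class currently has the fewest labels.
    counts = list(labeled_counts)
    remaining = list(shortlist)
    chosen = []
    while remaining and len(chosen) < budget:
        pick = min(remaining, key=lambda i: counts[pseudo[i]])
        chosen.append(pick)
        counts[pseudo[pick]] += 1
        remaining.remove(pick)
    return chosen
```

Ties in step 3 are broken in favor of the more uncertain sample, because the shortlist is kept in descending entropy order. The actual selection procedure is described in the paper and the linked repository.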

Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | Stanford Cars | Accuracy 70.7 | 635 |
| Image Classification | EuroSAT | Accuracy 81.5 | 569 |
| Image Classification | Flowers102 | Accuracy 96.94 | 558 |
| Image Classification | DTD | Accuracy 62.33 | 542 |
| Image Classification | Food-101 | Accuracy 70.45 | 542 |
| Image Classification | DTD | Accuracy 72.6 | 485 |
| Action Recognition | UCF101 | Accuracy 75.84 | 431 |
| Image Classification | Aircraft | Accuracy 32.27 | 333 |
| Image Classification | OxfordPets | Accuracy 83.16 | 160 |
| Image Classification | Caltech101 | Base Accuracy 93.85 | 129 |
Showing 10 of 24 rows
