
Towards Free Data Selection with General-Purpose Models

About

A desirable data selection algorithm can efficiently choose the most informative samples to maximize the utility of limited annotation budgets. However, current approaches, represented by active learning methods, typically follow a cumbersome pipeline that repeatedly alternates between time-consuming model training and batch data selection. In this paper, we challenge this status quo by designing a distinct data selection pipeline that utilizes existing general-purpose models to select data from various datasets with single-pass inference, without the need for additional training or supervision. A novel free data selection (FreeSel) method is proposed following this new pipeline. Specifically, we define semantic patterns extracted from intermediate features of the general-purpose model to capture subtle local information in each image. We then enable the selection of all data samples in a single pass through distance-based sampling at the fine-grained semantic pattern level. FreeSel bypasses the heavy batch selection process, achieving a significant improvement in efficiency and being 530x faster than existing active learning methods. Extensive experiments verify the effectiveness of FreeSel on various computer vision tasks. Our code is available at https://github.com/yichen928/FreeSel.

Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan • 2023
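
The abstract describes selecting images through distance-based sampling over per-image semantic patterns. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the patterns have already been extracted as small feature vectors, and it swaps in a simple greedy farthest-point rule because the abstract does not specify the exact sampling rule. The function name `select_images`, the argument `patterns_per_image`, and the toy vectors are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_images(patterns_per_image, budget, seed_index=0):
    """Pick `budget` image indices by greedy farthest-point selection.

    Assumption: each image is represented by a non-empty list of pattern
    vectors. An image is scored by its most novel pattern, i.e. the largest
    distance from any of its patterns to the nearest already-selected pattern.
    """
    n = len(patterns_per_image)
    if not 0 < budget <= n:
        raise ValueError("budget must be between 1 and the number of images")

    selected = [seed_index]
    # nearest[i][j]: distance from pattern j of image i to the closest
    # pattern among all selected images so far.
    nearest = [
        [min(euclidean(p, q) for q in patterns_per_image[seed_index]) for p in pats]
        for pats in patterns_per_image
    ]

    while len(selected) < budget:
        candidates = (i for i in range(n) if i not in selected)
        best = max(candidates, key=lambda i: max(nearest[i]))
        selected.append(best)
        # Refresh distances using only the newly selected image's patterns.
        for i, pats in enumerate(patterns_per_image):
            for j, p in enumerate(pats):
                d = min(euclidean(p, q) for q in patterns_per_image[best])
                if d < nearest[i][j]:
                    nearest[i][j] = d
    return selected


# Toy data (hypothetical): image 1 is a near-duplicate of image 0.
patterns = [
    [(0.0, 0.0), (0.1, 0.0)],
    [(0.0, 0.1), (0.1, 0.1)],
    [(5.0, 5.0), (5.1, 5.0)],
    [(0.0, 5.0), (0.2, 5.0)],
]
print(select_images(patterns, budget=3))  # [0, 2, 3]
```

With a budget of three, the near-duplicate image 1 is skipped in favor of images whose patterns lie far from those already chosen. Selection here needs no model training between picks, which is the property the abstract emphasizes; the feature-extraction step with a general-purpose model is omitted.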

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | DomainNet (test) | Average Accuracy | 77.7 | 219 |
| Image Classification | DomainNet | Accuracy (ClipArt) | 70.1 | 206 |
| Digit Classification | Digit-Five (test) | Average Accuracy | 52.5 | 60 |
| Video Quality Assessment | YouTube-UGC (test) | SRCC | 0.814 | 36 |
| Video Quality Assessment | LIVE-Livestream (test) | SRCC | 0.627 | 10 |
| Video Quality Assessment | CGVDS (test) | SRCC | 0.832 | 10 |
| Video Quality Assessment | AIGVQA-DB (test) | SRCC | 0.789 | 10 |
| Video Quality Assessment | YouTube-SFV SDR (test) | SRCC | 0.719 | 10 |
| Video Quality Assessment | YouTube-SFV HDR2SDR (test) | SRCC | 49.8 | 10 |
| Video Quality Assessment | YouTube-UGC, CGVDS, LIVE-Livestream, YouTube-SFV SDR, YouTube-SFV HDR2SDR, AIGVQA-DB | SRCC | 2 | 10 |
