Batch Bayesian Active Learning with Partial Batch Label Sampling

About

Over the past couple of decades, many active learning acquisition functions have been proposed, leaving practitioners with an unclear choice of which to use. Bayesian active learning offers principled objectives with explainable intuition, including Expected Error Reduction (EER), Expected Predictive Information Gain (EPIG), and Bayesian Active Learning by Disagreement (BALD). A key challenge for such methods is that they scale poorly to large batch sizes, leading to either computational challenges (BatchBALD) or dramatic performance drops (top-$B$ selection). Here, using a particular formulation of Bayesian Decision Theory, we derive Partial Batch Label Sampling (ParBaLS) for the EPIG algorithm. Experiments on several datasets show that ParBaLS EPIG gives superior performance for a fixed budget, using Bayesian Logistic Regression on embeddings from large pre-trained models. Our code is available at https://github.com/ADDAPT-ML/ParBaLS.
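
For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of BALD scoring followed by naive top-$B$ selection. It is not the ParBaLS method and is not taken from the authors' repository; the function names `bald_scores` and `top_b`, the array layout, and the random toy data are all assumptions made for illustration. It assumes predictive probabilities are available from a set of posterior samples.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    # Shannon entropy in nats; eps guards against log(0).
    return -np.sum(p * np.log(p + eps), axis=axis)

def bald_scores(probs):
    """BALD score per pool point.

    probs: array of shape (K, N, C) holding class probabilities for
    N pool points under K posterior samples (hypothetical layout).
    """
    mean_probs = probs.mean(axis=0)                  # (N, C) marginal predictive
    marginal_entropy = entropy(mean_probs)           # entropy of the mean prediction
    expected_entropy = entropy(probs).mean(axis=0)   # mean entropy across samples
    return marginal_entropy - expected_entropy       # mutual information estimate

def top_b(scores, b):
    # Naive batch selection: take the b highest-scoring points independently.
    return np.argsort(-scores)[:b]

# Toy data standing in for posterior predictive samples (assumption).
rng = np.random.default_rng(0)
K, N, C = 8, 100, 3
logits = rng.normal(size=(K, N, C))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

scores = bald_scores(probs)
batch = top_b(scores, 5)
```

Because top-$B$ scores each point independently, it can pick several near-duplicate points in one batch, which is one plausible reading of the performance drop the abstract mentions.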

Kangping Hu, Stephen Mussmann • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | CIFAR-10 (test) | Accuracy 95.4 | 882 |
| Classification | CIFAR10 (test) | Accuracy 85.52 | 331 |
| Text Classification | AG News (test) | Accuracy 88 | 293 |
| Text Classification | Yelp (test) | Accuracy 80.89 | 100 |
| Image Classification | fMoW (test) | Top-1 Accuracy 98.41 | 60 |
| Classification | CivilComments (test) | Average Accuracy 85.67 | 51 |
| News Classification | AG News (test) | Accuracy 84.11 | 48 |
| Classification | Airline Passenger Satisfaction (test) | Accuracy 89.38 | 45 |
| Image Classification | iWildCam (test) | Accuracy 89.77 | 45 |
| Classification | Credit Card Fraud (test) | Accuracy 93.46 | 45 |
Showing 10 of 20 rows
