SemiReward: A General Reward Model for Semi-supervised Learning

About

Semi-supervised learning (SSL) has witnessed great progress with various improvements in the self-training framework with pseudo labeling. The main challenge is how to select high-quality pseudo labels while avoiding confirmation bias. However, existing pseudo-label selection strategies are limited to pre-defined schemes or complex hand-crafted policies specially designed for classification, failing to achieve high-quality labels, fast convergence, and task versatility simultaneously. To these ends, we propose a Semi-supervised Reward framework (SemiReward) that predicts reward scores to evaluate and select high-quality pseudo labels, and that can be plugged into mainstream SSL methods across a wide range of task types and scenarios. To mitigate confirmation bias, SemiReward is trained online in two stages with a generator model and a subsampling strategy. With classification and regression tasks on 13 standard SSL benchmarks across three modalities, extensive experiments verify that SemiReward achieves significant performance gains and faster convergence over Pseudo Label, FlexMatch, and Free/SoftMatch. Code and models are available at https://github.com/Westlake-AI/SemiReward.
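
The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed threshold of 0.5, and the toy reward function are all hypothetical, chosen only to show the idea of scoring each (sample, pseudo label) pair and keeping the pairs with high reward. In SemiReward the scorer is a learned model trained online; here a hand-written stand-in takes its place.

```python
from typing import Callable, List, Sequence, Tuple


def filter_pseudo_labels(
    samples: Sequence[float],
    pseudo_labels: Sequence[float],
    reward_fn: Callable[[float, float], float],
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[Tuple[float, float]]:
    """Keep the (sample, pseudo_label) pairs whose reward score reaches the threshold."""
    kept = []
    for x, y in zip(samples, pseudo_labels):
        score = reward_fn(x, y)  # higher score = pseudo label judged more reliable
        if score >= threshold:
            kept.append((x, y))
    return kept


def toy_reward(x: float, y: float) -> float:
    """Hypothetical stand-in for a learned rewarder.

    Assumes a toy regression task where good labels lie near y = 2 * x,
    and maps the deviation to a score in [0, 1].
    """
    return max(0.0, 1.0 - abs(y - 2.0 * x))


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0]
    pseudo_labels = [2.1, 3.0, 6.0]  # the middle label is far from 2 * x
    print(filter_pseudo_labels(samples, pseudo_labels, toy_reward))
```

Because the scorer takes the label itself as input rather than a class-confidence value, the same filtering loop applies to regression targets as well as class labels, which is the kind of task versatility the abstract refers to.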

Siyuan Li, Weiyang Jin, Zedong Wang, Fang Wu, Zicheng Liu, Cheng Tan, Stan Z. Li • 2023

Related benchmarks

Task	Dataset	Result
Image Classification	ImageNet (10% labels)	Top-1 Acc 74.5	98
Image Classification	CIFAR-100 400 labels	--	82
Image Classification	CIFAR-100 2500 labels	Error Rate 9.42	70
Image Classification	CIFAR-100 (10000 labels)	Error Rate 8.99	50
Image Classification	STL-10 40 labels	--	42
Regression	RCF-MNIST	RMSE (Avg) 61.71	24
Text Classification	AG News 40 labels	Top-1 Error Rate 0.1067	19
Text Classification	Yahoo! Answer 500 labels	Top-1 Error Rate 0.3092	19
Text Classification	Yahoo! Answer 2000 labels	Top-1 Error Rate (%) 29.11	19
Text Classification	Yelp Review 250 labels	Top-1 Error 42.68	19

Showing 10 of 33 rows
