Prior shift estimation for positive unlabeled data through the lens of kernel embedding

About

We study estimation of the class prior for unlabeled target samples, which may differ from that of the source population. The source data are assumed to be only partially observable: only samples from the positive class and from the whole population are available (the PU learning scenario). We introduce a novel direct estimator of the class prior which avoids estimating posterior probabilities in both populations and has a simple geometric interpretation. It is based on a distribution matching technique combined with kernel embedding in a Reproducing Kernel Hilbert Space and is obtained as an explicit solution to an optimisation task. We establish its asymptotic consistency as well as an explicit non-asymptotic bound on its deviation from the unknown prior, which is calculable in practice. We study its finite-sample behaviour on synthetic and real data and show that the proposal consistently performs on par with or better than its competitors.

Jan Mielniczuk, Wojciech Rejchel, Paweł Teisseyre • 2025
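
To make the distribution-matching idea in the abstract concrete, here is a minimal sketch of one way a kernel-embedding estimator of a shifted class prior can be built. It is an illustration under stated assumptions, not necessarily the authors' estimator: it assumes label shift (class-conditional distributions are the same in source and target), a known source prior `source_prior`, and a Gaussian kernel with a hand-picked bandwidth `gamma`. All function and variable names are hypothetical. Under these assumptions the mean embeddings satisfy mu_T - mu_U = lambda * (mu_P - mu_U) with lambda = (pi' - pi) / (1 - pi), so lambda can be obtained in closed form as a projection coefficient in the RKHS.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mean_k(a, b, gamma):
    """Empirical inner product <mu_a, mu_b> of two kernel mean embeddings."""
    return rbf_kernel(a, b, gamma).mean()


def estimate_target_prior(pos, src, tgt, source_prior, gamma=0.5):
    """Closed-form projection estimate of the target class prior.

    pos: positive source samples, src: unlabeled source samples,
    tgt: unlabeled target samples. Assumes label shift and a known
    source prior (both are assumptions of this sketch).
    """
    # Numerator: <mu_T - mu_U, mu_P - mu_U>
    num = (mean_k(tgt, pos, gamma) - mean_k(tgt, src, gamma)
           - mean_k(src, pos, gamma) + mean_k(src, src, gamma))
    # Denominator: ||mu_P - mu_U||^2
    den = (mean_k(pos, pos, gamma) - 2.0 * mean_k(pos, src, gamma)
           + mean_k(src, src, gamma))
    lam = num / den
    # Map the projection coefficient back to a prior and clip to [0, 1].
    return float(np.clip(source_prior + (1.0 - source_prior) * lam, 0.0, 1.0))


def make_sample(n_pos, n_neg, rng):
    """Synthetic 2-D data: positives around (2, 2), negatives around (-2, -2)."""
    pos = rng.normal(loc=2.0, scale=1.0, size=(n_pos, 2))
    neg = rng.normal(loc=-2.0, scale=1.0, size=(n_neg, 2))
    return np.vstack([pos, neg])


rng = np.random.default_rng(0)
positives = make_sample(500, 0, rng)   # labeled positive source sample
source = make_sample(250, 250, rng)    # unlabeled source, prior 0.5
target = make_sample(400, 100, rng)    # unlabeled target, prior 0.8
estimate = estimate_target_prior(positives, source, target, source_prior=0.5)
print(f"estimated target prior: {estimate:.3f}")
```

The synthetic target sample is built with a true positive fraction of 0.8, so the printed estimate can be compared against that value. The clipping step and the biased (V-statistic) kernel averages are simplifications chosen for brevity.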

Related benchmarks

| Task | Dataset | Estimation Error | Rank |
| --- | --- | --- | --- |
| Prior Estimation | CIFAR | 0.016 | 72 |
| Prior Estimation | Fashion | 1.2 | 72 |
| Prior Estimation | MNIST | 2.4 | 72 |
| Class Prior Estimation | Diabetes | 0.043 | 36 |
| Class Prior Estimation | Spambase | 0.014 | 36 |
| Class Prior Estimation | Waveform | 1.6 | 36 |
| Class Prior Estimation | Yeast | 4.3 | 36 |
| Class Prior Estimation | vehicle | 2.8 | 36 |
| Class Prior Estimation | banknote | 1.9 | 36 |
| Class Prior Estimation | segment | 0.017 | 36 |
