
Unified Approach for Weakly Supervised Multicalibration

About

Multicalibration requires predicted scores to agree with label probabilities across rich families of subgroups and score-dependent tests, but existing methods require clean input-label pairs for evaluation and post-processing. This assumption fails in weakly supervised learning (WSL) regimes -- including positive-unlabeled, unlabeled-unlabeled, and positive-confidence learning -- where clean labels are costly or unavailable even though reliable uncertainty estimates may be crucial. We address this gap by developing estimators of multicalibration error and post-hoc correction methods for WSL settings in which clean input-label pairs are unavailable. We propose a unified framework for estimating and correcting multicalibration under weak supervision by combining contamination-matrix risk rewrites with witness-based calibration constraints, yielding corrected multicalibration moments with finite-sample guarantees. We further propose weak-label multicalibration boost (WLMC), a generic post-hoc recalibration algorithm under weak supervision. Finally, we conduct experiments across multiple weak-supervision settings to evaluate multicalibration behavior and offer empirical insight into uncertainty estimation under weak supervision.
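To make the idea of a label-free calibration moment concrete, here is a minimal sketch for the positive-unlabeled case only. It is not the authors' estimator and not WLMC. It relies on assumptions that the abstract does not state: positives are drawn from p(x | y = 1), unlabeled points are drawn from the marginal p(x), and the class prior pi = P(Y = 1) is known. Under those assumptions, for any test function w (for example, a subgroup indicator times a score-bin indicator), E[(Y - f(X)) w(X)] = pi * E_P[w(X)] - E_U[f(X) w(X)], so the moment can be estimated without clean input-label pairs. The function name and its arguments are hypothetical.

```python
from typing import Sequence


def pu_calibration_moment(
    w_pos: Sequence[float],
    scores_unl: Sequence[float],
    w_unl: Sequence[float],
    class_prior: float,
) -> float:
    """Estimate E[(Y - f(X)) * w(X)] from positive and unlabeled samples.

    Assumptions (illustrative, not taken from the paper):
      * w_pos holds w(x) evaluated on samples from p(x | y = 1);
      * scores_unl and w_unl hold f(x) and w(x) on samples from p(x);
      * class_prior is the known value of P(Y = 1).
    """
    if not 0.0 < class_prior < 1.0:
        raise ValueError("class_prior must lie strictly between 0 and 1")
    if len(w_pos) == 0 or len(scores_unl) == 0:
        raise ValueError("both samples must be non-empty")
    if len(scores_unl) != len(w_unl):
        raise ValueError("scores_unl and w_unl must have the same length")

    # E[Y * w(X)] = pi * E_P[w(X)]: estimated from the positive sample.
    label_term = class_prior * sum(w_pos) / len(w_pos)

    # E[f(X) * w(X)]: estimated from the unlabeled sample, no labels needed.
    score_term = sum(s * w for s, w in zip(scores_unl, w_unl)) / len(scores_unl)

    return label_term - score_term
```

A value near zero for a given w suggests the scores are approximately calibrated against that test. In practice the class prior would usually have to be estimated, which adds error that this sketch ignores.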

Futoshi Futami, Takashi Ishida • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Classification Calibration | MEPS (test) | Oracle ECE 1.74 | 150 |
| Income Prediction | ACSIncome (test) | Oracle ECE 0.92 | 150 |
| Classification | CreditDefault (test) | Oracle ECE 1.26 | 125 |
| Calibration | HMDA (test) | Oracle ECE 1.48 | 100 |
| Image Classification | CelebA (test) | Accuracy 92.11 | 82 |
| Classification | HMDA (test) | Oracle ECE 2.97 | 50 |
| Tabular Classification | CreditDefault (test) | Oracle ECE 4.77 | 25 |
| Calibration | CelebA ImageResNet (test) | ECE (Oracle Estimate) 1.01 | 20 |
| Toxicity Detection | CivilComments BERT (test) | Oracle ECE 1.27 | 20 |
| Calibration | CivilComments BERT (test) | ECE (Oracle Estimate) 1.35 | 5 |
