MLPlatt: Simple Calibration Framework for Ranking Models
About
Ranking models are extensively used in e-commerce for relevance estimation. These models often suffer from poor interpretability and a lack of scale calibration, particularly when trained with typical ranking loss functions. This paper addresses the problem of post-hoc calibration of ranking models. We introduce MLPlatt: a simple yet effective ranking model calibration method that preserves the item ordering and converts ranker outputs to interpretable click-through rate (CTR) probabilities usable in downstream tasks. The method is context-aware by design and achieves good calibration metrics both globally and within strata corresponding to different values of a selected categorical field (such as user country or device), which is often important from the business perspective of an e-commerce platform. We demonstrate the superiority of MLPlatt over existing approaches on two datasets, achieving an improvement of over 10% in F-ECE (Field Expected Calibration Error) compared to other methods. Most importantly, we show that high-quality calibration can be achieved without compromising the ranking quality.
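To make the per-field calibration idea concrete, the sketch below computes one common form of a field-level calibration error: within each value of a categorical field, the signed gap between observed clicks and predicted CTR is summed, and the absolute gaps are added up and divided by the number of samples. This is an illustrative assumption and may differ from the exact F-ECE definition used in the paper; the function name `field_ece` and the toy data are hypothetical.

```python
import numpy as np


def field_ece(y_true, p_pred, field):
    """Field-level calibration error (illustrative definition, assumed here).

    For each distinct field value, sum the signed residuals (label minus
    predicted probability), take the absolute value, then normalise the
    total by the number of samples.
    """
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    field = np.asarray(field)

    total_gap = 0.0
    for value in np.unique(field):
        mask = field == value
        # Over- and under-prediction cancel inside a field, so this measures
        # the aggregate bias per field rather than per-sample error.
        total_gap += abs(np.sum(y_true[mask] - p_pred[mask]))
    return total_gap / len(y_true)


# Toy example: clicks, predicted CTRs, and the user country as the field.
clicks = [1, 0, 0, 1]
predicted_ctr = [0.5, 0.5, 0.2, 0.4]
country = ["PL", "PL", "CZ", "CZ"]

score = field_ece(clicks, predicted_ctr, country)
print(f"field-level calibration error: {score:.4f}")
```

In this toy data the "PL" predictions are unbiased in aggregate while the "CZ" predictions under-estimate the click rate, so only the second field contributes to the score. A model can look well calibrated globally and still show a large value here, which is why the per-field view matters.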
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Calibration and Ranking | Allegro dataset | F-ECE: 0.0021 | 5 |
| CTR Prediction | AliExpress | F-ECE: 0.0028 | 5 |
| Ranking Calibration | Allegro proprietary (test) | F-ECE: 0.0021 | 4 |