MLPlatt: Simple Calibration Framework for Ranking Models

About

Ranking models are used extensively in e-commerce for relevance estimation. These models often suffer from poor interpretability and a lack of scale calibration, particularly when trained with typical ranking loss functions. This paper addresses the problem of post-hoc calibration of ranking models. We introduce MLPlatt: a simple yet effective ranking model calibration method that preserves the item ordering and converts ranker outputs to interpretable click-through rate (CTR) probabilities usable in downstream tasks. The method is context-aware by design and achieves good calibration metrics both globally and within strata corresponding to different values of a selected categorical field (such as user country or device), which is often important from the business perspective of an e-commerce platform. We demonstrate the superiority of MLPlatt over existing approaches on two datasets, achieving an improvement of over 10% in F-ECE (Field Expected Calibration Error) compared to other methods. Most importantly, we show that high-quality calibration can be achieved without compromising the ranking quality.
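The abstract does not spell out how F-ECE is computed. The following is a minimal sketch, assuming the field-level definition common in the calibration literature: for each value of the chosen field, take the absolute difference between the summed observed clicks and the summed predicted probabilities, add these up over all field values, and divide by the number of samples. The function name `field_ece` and the toy data are illustrative assumptions, not taken from the paper, and the paper's exact formulation may differ (for example, by binning predictions within each field).

```python
import numpy as np


def field_ece(y_true, p_pred, field):
    """Field-level expected calibration error (assumed definition).

    For every distinct field value, sum the residuals (label - prediction),
    take the absolute value, then normalise the total by the sample count.
    """
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    field = np.asarray(field)

    total = 0.0
    for value in np.unique(field):
        mask = field == value  # samples belonging to this stratum
        total += abs(np.sum(y_true[mask] - p_pred[mask]))
    return total / len(y_true)


# Toy data (hypothetical): binary click labels, predicted CTRs, user country.
clicks = [1, 0, 0, 1, 0, 0]
ctr = [0.5, 0.5, 0.2, 0.4, 0.3, 0.3]
country = ["PL", "PL", "PL", "CZ", "CZ", "CZ"]

score = field_ece(clicks, ctr, country)
print(f"F-ECE: {score:.4f}")
```

In this toy example the "CZ" stratum is perfectly calibrated in aggregate (one click against a summed prediction of 1.0), while "PL" over-predicts by 0.2, so the whole error comes from a single stratum. A metric computed only on the pooled data would not show which stratum is responsible.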

Piotr Bajger, Roman Dusek, Krzysztof Galias, Paweł Młyniec, Aleksander Wawer, Paweł Zawistowski • 2026

Related benchmarks

Task	Dataset	Result
Calibration and Ranking	Allegro dataset	F-ECE 0.0021	5
CTR Prediction	AliExpress	F-ECE 0.0028	5
Ranking Calibration	Allegro proprietary (test)	F-ECE 0.0021	4

