
Ranking-aware Reinforcement Learning for Ordinal Ranking

About

Ordinal regression and ranking are challenging due to inherent ordinal dependencies that conventional methods struggle to model. We propose Ranking-Aware Reinforcement Learning (RARL), a novel RL framework that explicitly learns these relationships. At its core, RARL features a unified objective that synergistically integrates regression and Learning-to-Rank (L2R), enabling mutual improvement between the two tasks. This is driven by a ranking-aware verifiable reward that jointly assesses regression precision and ranking accuracy, facilitating direct model updates via policy optimization. To further enhance training, we introduce Response Mutation Operations (RMO), which inject controlled noise to improve exploration and prevent stagnation at saddle points. The effectiveness of RARL is validated through extensive experiments on three distinct benchmarks.
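The abstract's two key components can be illustrated concretely. The sketch below is not the paper's implementation: the exact reward weighting, regression tolerance, and mutation noise model are assumptions for exposition. It shows a reward that combines regression precision (predictions within a tolerance of the ground truth) with pairwise ranking accuracy, plus a Response Mutation Operation that perturbs a fraction of predicted scores with bounded Gaussian noise.

```python
import random

# Illustrative sketch only: RARL's exact reward and mutation operators are
# not specified in this summary, so alpha, tol, sigma, and the clipping
# bounds below are hypothetical choices.

def ranking_aware_reward(pred, true, alpha=0.5, tol=1.0):
    """Verifiable reward mixing regression precision with pairwise ranking accuracy."""
    n = len(pred)
    # Regression term: fraction of predictions within an absolute tolerance.
    hits = sum(abs(p - t) <= tol for p, t in zip(pred, true)) / n
    # Ranking term: fraction of pairs whose predicted order matches the true order.
    correct = total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if true[i] != true[j]:
                total += 1
                if (pred[i] - pred[j]) * (true[i] - true[j]) > 0:
                    correct += 1
    rank_acc = correct / total if total else 1.0
    return alpha * hits + (1 - alpha) * rank_acc

def mutate_response(scores, sigma=0.5, mutate_prob=0.3, lo=0.0, hi=10.0, rng=None):
    """Response Mutation Operation: perturb some scores with clipped Gaussian noise."""
    rng = rng or random.Random(0)
    mutated = []
    for s in scores:
        if rng.random() < mutate_prob:
            s = min(hi, max(lo, s + rng.gauss(0.0, sigma)))
        mutated.append(s)
    return mutated
```

A perfect prediction earns the full reward of 1.0; a fully reversed ranking loses the entire ranking term, so the reward degrades even when individual regression errors are small.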

Aiming Hao, Chen Zhu, Jiashu Zhu, Jiahong Wu, Xiangxiang Chu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Aesthetic Quality Assessment | AVA v1 (test) | Kendall's Tau | 0.883 | 18 |
| Age Estimation | UTKFace | -- | -- | 13 |
| Facial Age Ranking | UTKFace | Kendall's Tau (2-imgs) | 0.921 | 6 |
| Object Counting | COCO-REM (1-img) | Accuracy | 0.718 | 6 |
| Object Counting Ranking | COCO-REM (2-imgs) | Kendall's Tau | 0.893 | 6 |
| Object Counting Ranking | COCO-REM (4-imgs) | Kendall's Tau | 0.871 | 6 |
| Object Counting Ranking | COCO-REM (8-imgs) | Kendall's Tau | 0.823 | 6 |
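The Kendall's Tau values reported above measure rank correlation between predicted and ground-truth orderings: +1 for identical orderings, -1 for fully reversed ones. A minimal pure-Python version of the simple tau-a form (without tie correction, which is an assumption about the exact variant reported):

```python
def kendall_tau(pred, true):
    """Kendall's Tau-a: (concordant - discordant) / total pairs."""
    assert len(pred) == len(true)
    n = len(pred)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant if both lists order items i and j the same way.
            s = (pred[i] - pred[j]) * (true[i] - true[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

For example, swapping one adjacent pair in a 4-item ranking flips 1 of the 6 pairs, giving a tau of (5 - 1) / 6 ≈ 0.667.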
