
REC-RL: Referring expression counting via Gaussian and range-based reward optimization

About

Referring expression counting (REC) is an intention-driven task that requires context-aware visual reasoning. While recent vision-language models incorporate language for visual understanding, most existing REC methods rely on rule-based reinforcement learning with rewards focused primarily on final accuracy, overlooking the quality of intermediate reasoning. We propose REC-RL, a reinforcement learning framework that introduces a think-range-answer paradigm to explicitly optimize the visual reasoning process. REC-RL employs Group Relative Policy Optimization and two lightweight rewards: an accuracy reward that combines range-based interval supervision with Gaussian-based precision guidance, and a format reward that enforces structured outputs. By modeling intermediate focus prediction as internal decision-making, REC-RL avoids additional annotations and better aligns with human perception. Extensive experiments demonstrate consistent improvements over strong baselines and robust generalization across benchmarks.
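
The abstract names the reward components but gives no formulas, so the following is only a minimal sketch of how such rewards might be wired together, not the authors' implementation. The tag template (`<think>`, `<range>`, `<answer>`), the weighted-sum combination, and the parameters `sigma` and `w_range` are all assumptions made for illustration. The group-relative advantage at the end is the standard normalization used by Group Relative Policy Optimization.

```python
import math
import re
from statistics import mean, pstdev

# Hypothetical think-range-answer template; the tag names and the "lo-hi"
# range syntax are assumptions, not taken from the paper.
PATTERN = re.compile(
    r"^<think>(.+?)</think>\s*"
    r"<range>\s*(\d+)\s*-\s*(\d+)\s*</range>\s*"
    r"<answer>\s*(\d+)\s*</answer>$",
    re.DOTALL,
)


def parse(output):
    """Return (lo, hi, answer) if the output follows the template, else None."""
    m = PATTERN.match(output.strip())
    if m is None:
        return None
    return int(m.group(2)), int(m.group(3)), int(m.group(4))


def format_reward(output):
    """1.0 when the output is well structured, 0.0 otherwise."""
    return 1.0 if parse(output) is not None else 0.0


def accuracy_reward(output, gt_count, sigma=2.0, w_range=0.5):
    """Assumed combination: a weighted sum of a range term and a Gaussian term.

    range term: 1.0 if the ground-truth count lies inside the predicted range.
    Gaussian term: exp(-(answer - gt)^2 / (2 * sigma^2)), peaking at an exact match.
    """
    parsed = parse(output)
    if parsed is None:
        return 0.0
    lo, hi, answer = parsed
    range_term = 1.0 if lo <= gt_count <= hi else 0.0
    gauss_term = math.exp(-((answer - gt_count) ** 2) / (2 * sigma ** 2))
    return w_range * range_term + (1.0 - w_range) * gauss_term


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its sampled group."""
    mu = mean(rewards)
    sd = pstdev(rewards)
    return [(r - mu) / (sd + eps) for r in rewards]
```

In this sketch a response such as `<think>count the red cars</think><range>3-6</range><answer>5</answer>` would be scored by adding `format_reward` and `accuracy_reward`, and the totals for all responses sampled for one prompt would then be passed to `group_relative_advantages`. A containment-only range term rewards arbitrarily wide ranges, so a real system would likely also penalize range width; that is omitted here.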

Hui Liu, Yunlai Teng, Kunlong Bai, Pengfei Qi, Haotian Yan, Liang Li, Junlan Feng • 2026

Related benchmarks

Task                          | Dataset                                          | Result           | Rank
Referring Expression Counting | REC 8K (val)                                     | MAE 5.05         | 23
Referring Expression Counting | REC 8K (test)                                    | MAE 5.06         | 23
Object Counting               | Cross-domain General Object Counting v1 (test)   | MAE (Sheep) 2.08 | 11
Referring Expression Counting | RefCOCO OOD (test)                               | MAE 0.23         | 2
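
MAE here is mean absolute error, the average absolute difference between predicted and ground-truth counts (lower is better). A minimal sketch of the computation follows; the function name and the sample counts are made up for illustration and are not taken from any of the benchmarks listed.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over paired count values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Illustrative counts only: absolute errors are 1, 0 and 3, so the MAE is 4/3.
example_mae = mean_absolute_error([3, 5, 10], [4, 5, 7])
```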
