Beyond Token-level Supervision: Unlocking the Potential of Decoding-based Regression via Reinforcement Learning

About

Decoding-based regression, which reformulates regression as a sequence generation task, has emerged as a promising paradigm for applying large language models to numerical prediction. However, its progress is hindered by the misalignment between discrete token-level objectives (e.g., cross-entropy) and continuous numerical values. Existing approaches that rely on token-level constraints often fail to capture the global magnitude of the target value, which limits their precision and generalization. In this paper, we propose to unlock the potential of decoding-based regression via Reinforcement Learning (RL). We formulate the generation process as a Markov Decision Process and use sequence-level rewards to enforce global numerical coherence. Extensive experiments on tabular regression and code metric regression demonstrate that our method (specifically with ReMax and GRPO) consistently outperforms both state-of-the-art token-level baselines and traditional regression heads, showing the benefit of introducing sequence-level signals. Our analysis further reveals that RL significantly enhances sampling efficiency and predictive precision, establishing decoding-based regression as a robust and accurate paradigm for general-purpose numerical prediction.
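To make the contrast between token-level and sequence-level signals concrete, here is a minimal sketch in Python. It is not the authors' implementation. The character-level tokenization, the negative-squared-error reward, and the penalty for malformed outputs are all assumptions chosen for illustration, and every function name is hypothetical. The two advantage functions follow the commonly described forms of GRPO (group-normalized rewards) and ReMax (reward minus the reward of a greedy rollout).

```python
import math
import statistics


def decode_number(tokens):
    """Join digit/sign/point tokens into a float; return None if malformed."""
    try:
        return float("".join(tokens))
    except ValueError:
        return None


def token_mismatches(pred_tokens, target_tokens):
    """Count positions where tokens differ (a crude stand-in for a token-level loss)."""
    return sum(p != t for p, t in zip(pred_tokens, target_tokens))


def sequence_reward(tokens, target, invalid_penalty=-10.0):
    """Assumed sequence-level reward: negative squared error of the decoded value."""
    value = decode_number(tokens)
    if value is None or not math.isfinite(value):
        return invalid_penalty  # hypothetical penalty for unparsable outputs
    return -(value - target) ** 2


def grpo_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within one group of samples."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def remax_advantages(rewards, greedy_reward):
    """ReMax-style advantages: subtract the greedy rollout's reward as a baseline."""
    return [r - greedy_reward for r in rewards]


if __name__ == "__main__":
    target_tokens = list("2.00")
    target = decode_number(target_tokens)
    # "1.99" differs from "2.00" in three tokens yet is numerically close;
    # "9.00" differs in one token yet is numerically far.
    for text in ("1.99", "9.00"):
        toks = list(text)
        print(text, token_mismatches(toks, target_tokens), sequence_reward(toks, target))

    samples = [list("1.99"), list("2.10"), list("9.00"), list("2..0")]
    rewards = [sequence_reward(s, target) for s in samples]
    print(grpo_advantages(rewards))
    print(remax_advantages(rewards, sequence_reward(list("2.10"), target)))
```

In a full training loop, each advantage would weight the log-probability of every token in its sampled sequence, so that all tokens of a numerically accurate prediction are reinforced together.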

Ming Chen, Sheng Tang, Rong-Xi Tan, Ziniu Li, Jiacheng Chen, Ke Xue, Chao Qian • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Regression | MoleculeNet Lipophilicity (test) | RMSE | 0.822 | 37 |
| Molecular property prediction | FreeSolv MoleculeNet | RMSE | 2.561 | 17 |
| Molecular property prediction | MoleculeNet ESOL (test) | RMSE | 0.878 | 12 |
| Tabular Regression | TALENT 100 regression tasks | RMSE (Mean) | 0.5151 | 8 |
| Code Latency Prediction | KBSS | Spearman's Rho | 0.6 | 7 |
| Code Peak-Memory Prediction | APPS | Correlation (rho) | 0.92 | 7 |
| Code metric regression | APPS Leetcode (test) | RMSE | 0.474 | 6 |
| Code metric regression | Triton Kernel Latency (test) | RMSE | 1.094 | 6 |
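The benchmarks above report RMSE and Spearman's rho. As a reference for how these two metrics are conventionally defined, here is a small standard-library sketch. The listing does not state how each benchmark normalizes targets or handles ties, so the average-rank tie handling below is an assumption, and the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between targets and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

RMSE penalizes the absolute size of each error, while Spearman's rho only checks whether predictions order the examples correctly, which is why a latency predictor can have a useful rho even when its raw values are off by a constant factor.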
