Generative Auto-Bidding with Unified Modeling and Exploration

About

Automated bidding is central to modern digital advertising. Early rule-based methods lacked adaptability, while subsequent Reinforcement Learning approaches modeled bidding as a Markov Decision Process but struggled with long-term dependencies. Recent generative models show promise, yet they lack explicit mechanisms to balance exploration and safety, relying solely on action perturbations or trajectory guidance without a safety fallback. This results in inefficient exploration and elevated financial risk for advertising platforms. To address this gap, we propose GUIDE (Generative Auto-Bidding with Unified Modeling and Exploration), a framework that synergistically integrates directed exploration with a safe fallback mechanism. GUIDE employs a Decision Transformer (DT) to jointly model historical bidding actions and environmental state transitions. A Q-value module guides the DT's exploration via regularization constraints, while an Inverse Dynamics Module (IDM) leverages DT-predicted future states to infer robust, behaviorally consistent actions as a safe policy fallback. The Q-value module then adaptively selects the final action between these two options, balancing exploration and safety. Together, these components form an integrated "explore-safeguard-select" pipeline that unifies efficiency and safety. We conduct extensive experiments on public datasets, in simulated auction environments, and through large-scale online deployment on Taobao, a leading Chinese advertising platform. Results show GUIDE consistently outperforms state-of-the-art baselines across all scenarios. In real-world deployment, GUIDE achieves notable gains: +4.10% ad GMV, +1.40% ad clicks, +1.66% ad cost, and +3.52% ad ROI, demonstrating its effectiveness and strong industrial applicability.
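The final "select" step of the pipeline described above can be illustrated with a minimal sketch. The abstract does not state the exact selection rule, so this assumes the simplest one: score both candidate actions with a Q-function and keep the higher-scoring one, defaulting to the fallback on ties. The names `select_action` and `toy_q`, and the toy Q-function itself, are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence

State = Sequence[float]


def select_action(
    state: State,
    explore_action: float,
    fallback_action: float,
    q_value: Callable[[State, float], float],
) -> float:
    """Pick between two candidate bids using a Q-function.

    Assumption: the rule is a plain comparison of Q-values, with ties
    resolved in favour of the fallback action. The paper may use a
    different rule.
    """
    if q_value(state, explore_action) > q_value(state, fallback_action):
        return explore_action
    return fallback_action


def toy_q(state: State, action: float) -> float:
    """Hypothetical Q-function that prefers bids close to 1.0."""
    return -((action - 1.0) ** 2)


# The exploratory bid (1.4) is further from 1.0 than the fallback (1.1),
# so the fallback is kept in this toy case.
chosen = select_action([0.5, 0.2], explore_action=1.4, fallback_action=1.1, q_value=toy_q)
print(chosen)
```

In this sketch the exploratory action would come from the Q-regularized Decision Transformer and the fallback from the Inverse Dynamics Module; both are passed in as plain numbers here to keep the example self-contained.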

Mingming Zhang, Feiqing Zhuang, Na Li, Shengjie Sun, Xiaowei Chen, Junxiong Zhu, Fei Xiao, Keping Yang, Lixin Zou, Chenliang Li • 2026

Related benchmarks

Task          Dataset                        Result         Rank
Auto-bidding  TaoBao real-world A/B (test)   ROI 3.52       10
Auto-bidding  AuctionNet 50% budget          Score 20.3     9
Auto-bidding  AuctionNet (75% budget)        Score 29.1     9
Auto-bidding  AuctionNet 100% budget         Score 37.6     9
Auto-bidding  AuctionNet 125% budget         Score 43.3     9
Auto-bidding  AuctionNet 150% budget         Score 48.3     9
Auto-bidding  Simulation environment         Score 8.34e+3  8