Generative Auto-Bidding with Unified Modeling and Exploration
About
Automated bidding is central to modern digital advertising. Early rule-based methods lacked adaptability, while subsequent Reinforcement Learning approaches modeled bidding as a Markov Decision Process but struggled with long-term dependencies. Recent generative models show promise, yet they lack explicit mechanisms to balance exploration and safety, relying solely on action perturbations or trajectory guidance without a safety fallback. This results in inefficient exploration and elevated financial risk for advertising platforms. To address this gap, we propose GUIDE (Generative Auto-Bidding with Unified Modeling and Exploration), a framework that synergistically integrates directed exploration with a safe fallback mechanism. GUIDE employs a Decision Transformer (DT) to jointly model historical bidding actions and environmental state transitions. A Q-value module guides the DT's exploration via regularization constraints, while an Inverse Dynamics Module (IDM) leverages DT-predicted future states to infer robust, behaviorally consistent actions as a safe policy fallback. The Q-value module then adaptively selects the final action between these two options, balancing exploration and safety. Together, these components form an integrated "explore-safeguard-select" pipeline that unifies efficiency and safety. We conduct extensive experiments on public datasets, in simulated auction environments, and through large-scale online deployment on Taobao, a leading Chinese advertising platform. Results show GUIDE consistently outperforms state-of-the-art baselines across all scenarios. In real-world deployment, GUIDE achieves notable gains: +4.10% ad GMV, +1.40% ad clicks, +1.66% ad cost, and +3.52% ad ROI, demonstrating its effectiveness and strong industrial applicability.
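The final "select" step described above can be illustrated with a minimal sketch. The abstract does not give the selection rule, so this assumes the Q-value module simply scores both candidates and keeps the higher-scoring one, with ties going to the safe fallback. The names `select_action` and `toy_q` are hypothetical, and the quadratic critic is a stand-in for a learned Q-network, not the authors' model.

```python
from typing import Callable, Sequence

State = Sequence[float]
QFunction = Callable[[State, float], float]


def select_action(
    state: State,
    explore_action: float,
    fallback_action: float,
    q_value: QFunction,
) -> float:
    """Pick between an exploratory and a fallback bid using a critic.

    Assumption: selection is a plain argmax over the two Q-values,
    and ties are resolved in favour of the fallback action.
    """
    q_explore = q_value(state, explore_action)
    q_fallback = q_value(state, fallback_action)
    return explore_action if q_explore > q_fallback else fallback_action


def toy_q(state: State, action: float) -> float:
    """Hypothetical critic: scores a bid by its closeness to state[0]."""
    target = state[0]
    return -((action - target) ** 2)


state = [0.8]
# The DT-style exploratory bid (1.1) is further from the target than the
# IDM-style fallback bid (0.9), so the fallback is kept.
chosen = select_action(state, explore_action=1.1, fallback_action=0.9, q_value=toy_q)
print(chosen)
```

In this toy setting the exploratory action is used only when the critic rates it strictly better, which is one simple way to bias the choice toward safety.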
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Auto-bidding | Taobao real-world A/B test | ROI | +3.52% | 10 |
| Auto-bidding | AuctionNet (50% budget) | Score | 20.3 | 9 |
| Auto-bidding | AuctionNet (75% budget) | Score | 29.1 | 9 |
| Auto-bidding | AuctionNet (100% budget) | Score | 37.6 | 9 |
| Auto-bidding | AuctionNet (125% budget) | Score | 43.3 | 9 |
| Auto-bidding | AuctionNet (150% budget) | Score | 48.3 | 9 |
| Auto-bidding | Simulation environment | Score | 8.34e+3 | 8 |