AutoBG: A Board Game Design Assistant with Interactive Ideation, Iterative Rulebook Generation, and Individualized Feedback
About
Designing a board game demands both thinking as a designer and experiencing as a player, while iterating through repeated prototyping and playtesting cycles, making it a cognitively intensive creative task well suited for human-AI collaboration. However, current systems lack end-to-end support to guide designers through the complete workflow from vague early ideation to iterative rulebook revision and audience testing. To this end, we present AutoBG, a board game design assistant built around critic-driven iterative refinement, comprising four specialized modules: BG-Ideator guides designers via multi-turn dialogue to produce structured design drafts; BG-Realizer generates complete rulebooks from drafts and revises them in a closed loop with BG-Critic, which diagnoses design flaws and gates each revision so that only verified improvements are accepted; and BG-Persona simulates individualized feedback from 150 real player profiles. Together, these modules enable designers to go from an initial idea to a polished, audience-tested rulebook within a single integrated workflow. The system is built on 2.2K structured rulebooks and 180K quality-filtered real player reviews, with task-specific training data derived for each module. Experiments on 207 held-out games show that AutoBG substantially outperforms state-of-the-art baselines (e.g., GPT-5.4), generating rulebooks that approach the quality of published games. Furthermore, a user study with 30 participants across diverse experience levels confirms that AutoBG effectively reduces blank-page anxiety, surfaces hidden design flaws, and provides highly rated, practical assistance throughout the creative process.
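The critic-gated revision loop described above (BG-Realizer proposes a revision, BG-Critic diagnoses flaws and accepts the revision only if it is a verified improvement) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Critique`, `refine`, `toy_critic`, and `toy_revise`, the numeric score, and the "strictly higher score" acceptance rule are all assumptions made for illustration, and the toy critic is a simple keyword check standing in for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Critique:
    """Hypothetical critic output: an overall score plus diagnosed flaws."""
    score: float
    flaws: List[str]


def refine(
    draft: str,
    critic: Callable[[str], Critique],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, Critique, int]:
    """Keep a revision only when the critic scores it strictly higher."""
    best, best_crit = draft, critic(draft)
    accepted = 0
    for _ in range(max_rounds):
        if not best_crit.flaws:
            break  # nothing left to fix
        candidate = revise(best, best_crit.flaws)
        cand_crit = critic(candidate)
        # Gate: a rejected candidate is discarded and the best draft is kept.
        if cand_crit.score > best_crit.score:
            best, best_crit = candidate, cand_crit
            accepted += 1
    return best, best_crit, accepted


# Toy stand-ins (assumptions): the critic checks for required sections,
# and the reviser adds a placeholder for the first diagnosed flaw.
REQUIRED = ["setup", "turn order", "win condition"]


def toy_critic(text: str) -> Critique:
    flaws = [r for r in REQUIRED if r not in text.lower()]
    return Critique(score=float(len(REQUIRED) - len(flaws)), flaws=flaws)


def toy_revise(text: str, flaws: List[str]) -> str:
    return text + f"\n{flaws[0].title()}: TBD"


final_text, final_crit, n_accepted = refine(
    "Setup: deal 5 cards.", toy_critic, toy_revise
)
```

With these toy stand-ins, each accepted round removes one diagnosed flaw; a reviser that makes the draft worse would leave the original draft unchanged, which is the point of gating.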
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Comparison | Held-out games (test) | Accuracy | 96.8 | 6 |
| Diagnostic | Held-out games (test) | Quality Score | 6.07 | 6 |
| Rating | Held-out games (test) | MAE | 0.49 | 6 |
| Individualized Feedback Simulation | Individualized Feedback 210 instances, 50 players (test) | MAE | 1.19 | 5 |
| Rulebook Revision | Board Game Rulebook 207 games (test) | Improvement (%) | 95.9 | 5 |
| Rulebook Generation | Board Game Rulebook 207 games (test) | Rating | 6.95 | 5 |