Bloom: Designing for LLM-Augmented Behavior Change Interactions
About
Large language models (LLMs) offer novel opportunities to support health behavior change, yet existing work has narrowly focused on text-only interactions. Building on decades of HCI research on effective behavior change interactions, we present Bloom, an application for physical activity promotion that integrates an LLM-based health coaching chatbot with existing design strategies and UI elements. As part of Bloom's development, we conducted a red-teaming evaluation and contribute a safety benchmark dataset. In a four-week randomized field study (N=54) comparing Bloom to a no-LLM control, we observed important shifts in psychological outcomes: participants in the LLM condition reported stronger beliefs that activity was beneficial, greater enjoyment, and more self-compassion. Both conditions significantly increased physical activity levels, doubling the proportion of participants meeting recommended weekly guidelines, though descriptively, we observed no advantage for the LLM condition in short-term physical activity levels. Instead, our findings suggest that LLMs may be more effective at shifting mindsets that precede longer-term behavior change.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Safety Classification | Bloom safety filter benchmark 400 examples (val) | Accuracy: 100 | 7 |
| Safety Classification | Bloom safety filter benchmark 100 examples (test) | Accuracy: 100 | 7 |
| Safety Classification | Bloom safety filter benchmark 100 examples (corrected) (test) | Accuracy: 100 | 7 |