Bloom: Designing for LLM-Augmented Behavior Change Interactions

About

Large language models (LLMs) offer novel opportunities to support health behavior change, yet existing work has narrowly focused on text-only interactions. Building on decades of HCI research on effective behavior change interactions, we present Bloom, an application for physical activity promotion that integrates an LLM-based health coaching chatbot with existing design strategies and UI elements. As part of Bloom's development, we conducted a red-teaming evaluation and contribute a safety benchmark dataset. In a four-week randomized field study (N=54) comparing Bloom to a no-LLM control, we observed important shifts in psychological outcomes: participants in the LLM condition reported stronger beliefs that activity was beneficial, greater enjoyment, and more self-compassion. Both conditions significantly increased physical activity levels, doubling the proportion of participants meeting recommended weekly guidelines, though, descriptively, we observed no advantage for the LLM condition in short-term physical activity levels. Instead, our findings suggest that LLMs may be more effective at shifting mindsets that precede longer-term behavior change.

Matthew Jörke, Defne Genç, Valentin Teutschbein, Shardul Sapkota, Sarah Chung, Paul Schmiedmayer, Maria Ines Campero, Abby C. King, Emma Brunskill, James A. Landay • 2025

Related benchmarks

Task	Dataset	Result
Safety Classification	Bloom safety filter benchmark 400 examples (val)	Accuracy 100	7
Safety Classification	Bloom safety filter benchmark 100 examples (test)	Accuracy 100	7
Safety Classification	Bloom safety filter benchmark 100 examples (corrected) (test)	Accuracy 100	7

