Do Vision Language Models Understand Human Engagement in Games?

About

Inferring human engagement from gameplay video is important for game design and player-experience research, yet it remains unclear whether vision-language models (VLMs) can infer such latent psychological states from visual cues alone. Using the GameVibe Few-Shot dataset across nine first-person shooter games, we evaluate three VLMs under six prompting strategies, including zero-shot prediction, theory-guided prompts grounded in Flow, GameFlow, Self-Determination Theory, and MDA, and retrieval-augmented prompting. We consider both pointwise engagement prediction and pairwise prediction of engagement change between consecutive windows. Results show that zero-shot VLM predictions are generally weak and often fail to outperform simple per-game majority-class baselines. Memory- or retrieval-augmented prompting improves pointwise prediction in some settings, whereas pairwise prediction remains consistently difficult across strategies. Theory-guided prompting alone does not reliably help and can instead reinforce surface-level shortcuts. These findings suggest a perception-understanding gap in current VLMs: although they can recognize visible gameplay cues, they still struggle to robustly infer human engagement across games.

Ziyi Wang, Qizan Guo, Rishitosh Singh, Xiyang Hu • 2026
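
The abstract compares VLM predictions against "simple per-game majority-class baselines". As a rough illustration of what such a baseline might look like, the sketch below always predicts each game's most frequent label and reports per-game accuracy. The function name, the `(game, label)` record format, the train/test split, and the toy labels are all assumptions made for illustration; the paper's actual baseline protocol and label scheme may differ.

```python
from collections import Counter, defaultdict


def per_game_majority_accuracy(train, test):
    """Score a per-game majority-class baseline.

    `train` and `test` are lists of (game, label) pairs. This record
    format is hypothetical, not taken from the paper.
    """
    # Count how often each label occurs per game in the reference split.
    counts = defaultdict(Counter)
    for game, label in train:
        counts[game][label] += 1

    # The baseline "prediction" for a game is its most frequent label.
    majority = {game: c.most_common(1)[0][0] for game, c in counts.items()}

    correct = defaultdict(int)
    total = defaultdict(int)
    for game, label in test:
        total[game] += 1
        # Games never seen in `train` have no majority label and score 0.
        if majority.get(game) == label:
            correct[game] += 1

    return {game: correct[game] / total[game] for game in total}


# Toy data, invented for illustration only.
train = [("A", "high"), ("A", "high"), ("A", "low"), ("B", "low"), ("B", "low")]
test = [("A", "high"), ("A", "low"), ("B", "low"), ("B", "low")]
scores = per_game_majority_accuracy(train, test)
print(scores)
```

A model would need to exceed these per-game scores to be said to beat the baseline. The same function applies to the pairwise task if each label encodes the direction of engagement change between consecutive windows.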

Related benchmarks

Task	Dataset	Result
Pairwise engagement prediction	Borderlands 3	Accuracy 87.5	15
Pairwise engagement prediction	CS:GO Office	Accuracy 76.9	15
Pairwise engagement prediction	Blitz Brigade	Accuracy 0.714	15
Pairwise engagement prediction	Corridor 7	Accuracy 73	15
Pairwise engagement prediction	Battlefield 42	Accuracy 67.2	15
Pairwise engagement prediction	Apex Legends	Accuracy 71.4	15
Pairwise engagement prediction	CSGO 19	Accuracy 70	15
Pairwise engagement prediction	CSGO 18	Accuracy 53.6	15
Pairwise engagement prediction	CS 1.6	Accuracy 66.7	15
Pointwise engagement prediction	Borderlands 3	Accuracy 84.2	15

Showing 10 of 18 rows
