
Game of Thought: Robust Information Seeking with Large Language Models Using Game Theory

About

Large Language Models (LLMs) are increasingly deployed in real-world scenarios where they may lack sufficient information to complete a given task. In such settings, the ability to actively seek out missing information becomes a critical capability. Existing approaches to enhancing this ability often rely on simplifying assumptions that degrade worst-case performance. This is an issue with serious implications in high-stakes applications. In this work, we use the game of Twenty Questions to evaluate the information-seeking ability of LLMs. We introduce and formalize its adversarial counterpart, the Strategic Language Search (SLS) problem, along with its variants, as a two-player zero-sum extensive-form game. We propose Game of Thought (GoT), a framework that applies game-theoretic techniques to approximate a Nash equilibrium (NE) strategy for the restricted variant of the game. Empirical results demonstrate that our approach consistently improves worst-case performance compared to (1) direct prompting-based methods and (2) heuristic-guided search methods across all tested settings.
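To make the worst-case objective concrete: in an adversarial Twenty Questions game, the answerer can be modeled as choosing whichever answer keeps the larger set of secrets consistent, and the questioner as minimizing over questions against that adversary. The sketch below is a minimal illustration of this minimax recursion on a toy domain (integers with bit-test questions); it is not the paper's GoT algorithm, which approximates an NE strategy for the full SLS game.

```python
import math

def worst_case_length(candidates, questions):
    """Minimax worst-case interaction length for adversarial 20 Questions:
    the questioner picks the question minimizing the remaining rounds,
    while an adversarial answerer picks the answer that keeps the larger
    consistent subset of candidate secrets."""
    if len(candidates) <= 1:
        return 0
    best = math.inf
    for q in questions:
        yes = frozenset(c for c in candidates if q(c))
        no = candidates - yes
        if not yes or not no:
            continue  # uninformative question: skip it
        # Adversary answers so as to leave the harder (larger) branch.
        cost = 1 + max(worst_case_length(yes, questions),
                       worst_case_length(no, questions))
        best = min(best, cost)
    return best

# Toy domain (hypothetical, for illustration): secrets are 0..7,
# questions test individual bits, so perfect halving is possible.
secrets = frozenset(range(8))
bit_questions = [lambda c, b=b: (c >> b) & 1 for b in range(3)]
print(worst_case_length(secrets, bit_questions))  # 3 = ceil(log2(8))
```

With questions that split the candidate set evenly, the guaranteed interaction length is logarithmic in the number of secrets; the worst-case metric in the benchmarks below measures exactly this adversarial quantity rather than average-case performance.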

Langyuan Cui, Chun Kai Ling, Hwee Tou Ng • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
20 Questions | 20Q Common | Worst Case Interaction Length | 10 | 8
20 Questions | 20Q S128 | Worst Case Interaction Length | 10.8 | 8
20 Questions | 20Q Breeds | Worst Case Interaction Length | 6.6 | 8
Medical Diagnosis | MD DX | Worst Case Interaction Length | 10.5 | 8
Troubleshooting | TS FloDial | Worst Case Interaction Length | 7.5 | 8
Information Seeking | 20Q Breeds weighted (test) | Worst-case Weighted Payoff | 32.3 | 8
Information Seeking | 20Q Common weighted (test) | Worst-case Weighted Payoff | 152.1 | 8
Medical Diagnosis | MD DX weighted (test) | Worst-case Weighted Payoff | 78.3 | 8
Troubleshooting | TS FloDial weighted (test) | Worst-case Weighted Payoff | 62.3 | 8
