UniMaia: Steering Chess Policies with Language for Human-like Play

About

Recent advances in large language models have enabled natural language to serve as a flexible interface for controlling complex systems, but often at the cost of large-scale multimodal training or weakened domain-specific inductive biases. In structured decision-making domains such as chess, specialized policy networks achieve strong performance but lack semantic controllability, while prompt-conditioned language models are more flexible yet typically exhibit weaker domain grounding. We propose UniMaia, a framework for prompt-conditioned policy modulation that adapts a frozen Lc0-based chess policy network using a parameter-efficient text encoder and a ControlNet-style conditioning mechanism. UniMaia enables semantic control over gameplay, including opening selection and player strength, while preserving the pretrained policy representations. We further introduce UniMaia-Aux, which incorporates auxiliary temporal conditioning and behavioral prediction objectives. To support this work, we construct a large-scale metadata-augmented Lichess dataset, develop a semi-automated prompt-generation pipeline, and introduce benchmarks spanning both prompt-conditioned and metadata-conditioned settings. UniMaia achieves state-of-the-art expected accuracy on several prompt-conditioned benchmarks and competitive top-move accuracy on general instruction-following tasks, while remaining competitive with dedicated metadata-conditioned approaches on human move prediction benchmarks. UniMaia-Aux further improves expected accuracy and behavioral modeling across several evaluation settings, with modest trade-offs in top-move accuracy. Overall, our results demonstrate that prompt-conditioned control of domain-specific policy networks is feasible without end-to-end multimodal training, while highlighting trade-offs between controllability and predictive performance.
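
The ControlNet-style conditioning mentioned in the abstract can be pictured with a small sketch. This is an illustration of the general idea only (a frozen base network plus a trainable branch whose output projection starts at zero), not the authors' implementation. The class names `FrozenPolicy` and `ZeroInitAdapter`, the toy dimensions, and the choice to add the residual to the board features are all assumptions made for the example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class FrozenPolicy:
    """Hypothetical stand-in for a pretrained policy head; its weights are never updated."""

    def __init__(self, n_features, n_moves, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.gauss(0, 1) for _ in range(n_features)]
                        for _ in range(n_moves)]

    def logits(self, features):
        return matvec(self.weights, features)


class ZeroInitAdapter:
    """Hypothetical trainable branch: prompt embedding -> residual on board features.

    The projection starts at zero, so before any training the residual is zero
    and the frozen policy's output is reproduced exactly.
    """

    def __init__(self, n_prompt, n_features):
        self.weights = [[0.0] * n_prompt for _ in range(n_features)]

    def residual(self, prompt_embedding):
        return matvec(self.weights, prompt_embedding)


def conditioned_policy(policy, adapter, board_features, prompt_embedding):
    """Add the prompt-dependent residual to the features, then run the frozen policy."""
    delta = adapter.residual(prompt_embedding)
    steered = [f + d for f, d in zip(board_features, delta)]
    return softmax(policy.logits(steered))


policy = FrozenPolicy(n_features=4, n_moves=3)
adapter = ZeroInitAdapter(n_prompt=2, n_features=4)
board = [0.5, -1.0, 0.25, 2.0]   # toy board features
prompt = [1.0, -1.0]             # toy prompt embedding

base = softmax(policy.logits(board))
steered = conditioned_policy(policy, adapter, board, prompt)
```

With the adapter still at its zero initialization, `steered` equals `base`; only once the adapter weights move away from zero does the prompt shift the move distribution. That property is what lets a conditioning branch be attached without disturbing the pretrained policy at the start of training.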

Sherman Siu, Lesley Istead (1, 2) ((1) University of Waterloo, (2) Carleton University) • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Move prediction | LIF | Expected Accuracy 55.87 | 27 |
| Move prediction | LOB-C | E[Acc] 81.62 | 27 |
| Next-move prediction | M2R | Accuracy@1 55.36 | 24 |
| Next-move prediction | ABB | Acc@1 55.48 | 24 |
| Next-move prediction | M1-S | Accuracy@1 56.67 | 24 |
| Chess Move Prediction | ABB (first 10 plies omitted) | Acc@1 55.48 | 22 |
| Move prediction | LOB-P | E[Acc] 59.73 | 19 |
| Move prediction | LGB | E[Acc] 48.91 | 19 |
| Chess Move Prediction | M2R (first 10 plies omitted) | Top-1 Accuracy 55.36 | 17 |
| Chess Move Prediction | M1-S (first 10 plies omitted) | Top-1 Accuracy 56.67 | 17 |
Showing 10 of 43 rows
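
The benchmark rows above report two kinds of metric: top-1 accuracy (Acc@1) and expected accuracy (E[Acc]). This page does not define either, so the sketch below rests on an assumption: top-1 accuracy counts how often the model's most probable move matches the move actually played, and expected accuracy averages the probability the model assigns to the played move. The function names and the toy move distributions are hypothetical.

```python
def top1_accuracy(predictions, played_moves):
    """Fraction of positions where the highest-probability move is the played move."""
    hits = 0
    for dist, played in zip(predictions, played_moves):
        best_move = max(dist, key=dist.get)
        hits += best_move == played
    return hits / len(played_moves)


def expected_accuracy(predictions, played_moves):
    """Mean probability assigned to the played move (assumed definition of E[Acc])."""
    total = sum(dist.get(played, 0.0)
                for dist, played in zip(predictions, played_moves))
    return total / len(played_moves)


# Toy data: one predicted move distribution per position, in UCI notation.
predictions = [
    {"e2e4": 0.6, "d2d4": 0.3, "g1f3": 0.1},
    {"e7e5": 0.3, "c7c5": 0.6, "e7e6": 0.1},
]
played_moves = ["e2e4", "e7e5"]

acc_at_1 = top1_accuracy(predictions, played_moves)
exp_acc = expected_accuracy(predictions, played_moves)
```

Under this reading the two metrics can diverge: a model that spreads probability over several plausible human moves may score well on expected accuracy while losing some top-1 accuracy, which is consistent with the trade-off the abstract reports for UniMaia-Aux.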
