Codifying Character Logic in Role-Playing

About

This paper introduces Codified Profiles for role-playing, a novel approach that represents character logic as structured, executable functions for behavioral decision-making. Each profile defines a set of functions parse_by_scene(scene) that output a list of logic-grounded assertions, triggered_statements, using both explicit control structures (e.g., if-then-else) and condition checks like check_condition(scene, question), where each question is a semantically meaningful prompt about the scene (e.g., "Is the character in danger?") that the role-playing LLM classifies as true, false, or unknown. This explicit representation offers three key advantages over traditional prompt-based profiles, which append character descriptions directly to text prompts: (1) Persistence, by enforcing complete and consistent execution of character logic rather than relying on the model's implicit reasoning; (2) Updatability, through systematic inspection and revision of behavioral logic, which is difficult to track or debug in prompt-only approaches; (3) Controllable Randomness, by supporting stochastic behavior directly within the logic, enabling fine-grained variability that prompting alone struggles to achieve. To validate these advantages, we introduce a new benchmark constructed from 83 characters and 5,141 scenes curated from Fandom, using NLI-based scoring to compare character responses against ground-truth actions. Our experiments demonstrate the significant benefits of codified profiles in improving persistence, updatability, and behavioral diversity. Notably, by offloading a significant portion of reasoning to preprocessing, codified profiles enable even 1B-parameter models to perform high-quality role-playing, providing a scalable and efficient foundation for local deployment of role-play agents.
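To make the idea concrete, the following is a minimal sketch of what a codified profile might look like. Only the names parse_by_scene, check_condition, and triggered_statements come from the abstract. Everything else is an assumption for illustration: the keyword_checker function is a hypothetical stand-in for the role-playing LLM, and the questions, statements, and the 0.7 probability are invented. It is not the authors' implementation.

```python
import random
from typing import Callable, List, Optional

# A condition check returns True, False, or None (unknown).
Checker = Callable[[str, str], Optional[bool]]


def keyword_checker(scene: str, question: str) -> Optional[bool]:
    """Hypothetical stand-in for the role-playing LLM.

    A real system would ask a model the question about the scene;
    here a keyword lookup keeps the sketch runnable offline.
    """
    rules = {
        "Is the character in danger?": ["attack", "fire", "trap"],
        "Is a friend present?": ["friend", "ally"],
    }
    keywords = rules.get(question)
    if keywords is None:
        return None  # the checker cannot answer this question
    text = scene.lower()
    return any(keyword in text for keyword in keywords)


def parse_by_scene(
    scene: str,
    check_condition: Checker = keyword_checker,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Run the character logic on a scene and collect triggered statements."""
    rng = rng or random.Random()
    triggered_statements: List[str] = []

    # Explicit control flow: an unknown (None) answer is treated as not triggered.
    if check_condition(scene, "Is the character in danger?"):
        if check_condition(scene, "Is a friend present?"):
            triggered_statements.append("The character protects the friend first.")
        elif rng.random() < 0.7:
            # Controllable randomness: the branch probability is explicit in code.
            triggered_statements.append("The character fights back.")
        else:
            triggered_statements.append("The character retreats.")
    else:
        triggered_statements.append("The character stays calm and observes.")

    return triggered_statements
```

In this sketch the returned statements would be passed to the role-playing model as grounding for its response. Because the logic is ordinary code, a behavior can be revised by editing one branch, and passing a seeded random.Random makes the stochastic branch reproducible.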

Letian Peng, Jingbo Shang • 2025

Related benchmarks

Task | Dataset | Result | Rank
Role-playing | Role-playing evaluation (Main characters) | ROUGE-L (Haruhi): 82.77 | 12
Role-playing performance evaluation | Fandom (test) | Haruhi Adherence Score: 57.94 | 8
Role-playing performance evaluation | Bandori (test) | PoPiPa Score: 73.02 | 8
Role-playing performance | Role-playing Artifacts Minor characters 1.0 | Score (K-On!): 81.54 | 7
Role-playing | Role-playing evaluation (Minor characters) | K-On! ROUGE-L: 21.21 | 5