Visual Personalization Turing Test

About

We introduce the Visual Personalization Turing Test (VPTT), a new paradigm for evaluating contextual visual personalization based on perceptual indistinguishability, rather than identity replication. A model passes the VPTT if its output (image, video, 3D asset, etc.) is indistinguishable to a human or calibrated VLM judge from content a given person might plausibly create or share. To operationalize VPTT, we present the VPTT Framework, integrating a 10k-persona benchmark (VPTT-Bench), a visual retrieval-augmented generator (VPRAG), and the VPTT Score, a text-only metric calibrated against human and VLM judgments. We show high correlation across human, VLM, and VPTT evaluations, validating the VPTT Score as a reliable perceptual proxy. Experiments demonstrate that VPRAG achieves the best alignment-originality balance, offering a scalable and privacy-safe foundation for personalized generative AI.
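The abstract's claim that the VPTT Score tracks human and VLM judgments is a rank-agreement claim. The sketch below shows one common way to check such agreement, Spearman rank correlation, using only the Python standard library. The abstract does not say which correlation statistic the authors use, so treating it as Spearman is an assumption. The function names and all score values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample scores, for illustration only (not from the paper).
proxy_scores = [0.20, 0.50, 0.40, 0.90, 0.70]   # a text-only proxy metric
human_ratings = [1, 3, 2, 5, 4]                 # human judge ratings
rho = spearman(proxy_scores, human_ratings)
```

In this toy data the two lists order the samples identically, so `rho` comes out at 1 up to floating-point rounding. On real evaluations, a value close to 1 would support using the proxy metric in place of human rating.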

Rameen Abdal, James Burgess, Sergey Tulyakov, Kuan-Chieh Jackson Wang • 2026

Related benchmarks

Task	Dataset	Metric	Result	
Contextual Visual Personalization (Editing)	VPTT-Bench 1.0 (test)	VPTTscore (V)	0.626	30
Generation	VPTT-Bench 1.0 (test)	VPTT Score (Novelty Adjusted)	0.644	15
Visual Personalization	Visual Personalization Evaluation Set	VIPER Proxy Score (PS)	97.4	12
Image Generation and Editing	VPTT Human Study (6000 annotations) (test)	VPTTscore-c (Text) Avg	0.464	4