Visual Personalization Turing Test

About

We introduce the Visual Personalization Turing Test (VPTT), a new paradigm for evaluating contextual visual personalization based on perceptual indistinguishability, rather than identity replication. A model passes the VPTT if its output (image, video, 3D asset, etc.) is indistinguishable to a human or calibrated VLM judge from content a given person might plausibly create or share. To operationalize VPTT, we present the VPTT Framework, integrating a 10k-persona benchmark (VPTT-Bench), a visual retrieval-augmented generator (VPRAG), and the VPTT Score, a text-only metric calibrated against human and VLM judgments. We show high correlation across human, VLM, and VPTT evaluations, validating the VPTT Score as a reliable perceptual proxy. Experiments demonstrate that VPRAG achieves the best alignment-originality balance, offering a scalable and privacy-safe foundation for personalized generative AI.
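The correlation claim above can be illustrated with a small sketch. The abstract does not say which correlation statistic is used, so Spearman rank correlation is an assumption here, and the score lists are made-up toy values, not results from the paper. The function names (`average_ranks`, `pearson`, `spearman`) are hypothetical helpers written for this illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-sample scores (illustrative only, NOT data from the paper).
human_scores = [0.20, 0.50, 0.90, 0.40, 0.70]   # hypothetical human ratings
vlm_scores = [0.30, 0.45, 0.80, 0.50, 0.75]     # hypothetical VLM judge ratings
metric_scores = [0.25, 0.55, 0.85, 0.35, 0.60]  # hypothetical text-only metric

print("human vs metric:", spearman(human_scores, metric_scores))
print("human vs VLM:   ", spearman(human_scores, vlm_scores))
```

A rank-based statistic is used in this sketch because it only asks whether the metric orders samples the same way the judges do, which fits the idea of a proxy metric; a different statistic would follow the same pattern.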

Rameen Abdal, James Burgess, Sergey Tulyakov, Kuan-Chieh Jackson Wang • 2026

Related benchmarks

Task | Dataset | Result | Rank
Contextual Visual Personalization (Editing) | VPTT-Bench 1.0 (test) | VPTTscore (V): 0.626 | 30
Generation | VPTT-Bench 1.0 (test) | VPTT Score (Novelty Adjusted): 0.644 | 15
Visual Personalization | Visual Personalization Evaluation Set | VIPER Proxy Score (PS): 97.4 | 12
Image Generation and Editing | VPTT Human Study (6000 annotations) (test) | VPTTscore-c (Text) Avg: 0.464 | 4