
Towards In-Context Tone Style Transfer with A Large-Scale Triplet Dataset

About

Tone style transfer for photo retouching aims to adapt the stylistic tone of the reference image to a given content image. However, the lack of high-quality large-scale triplet datasets with stylized ground truth forces existing methods to rely on self-supervised or proxy objectives, which limits model capability. To mitigate this gap, we design a data construction pipeline to build TST100K, a large-scale dataset of 100,000 content-reference-stylized triplets. At the core of this pipeline, we train a tone style scorer to ensure strict stylistic consistency for each triplet. In addition, existing methods typically extract content and reference features independently and then fuse them in a decoder, which may cause semantic loss and lead to inappropriate color transfer and degraded visual aesthetics. Instead, we propose ICTone, a diffusion-based framework that performs tone transfer in an in-context manner by jointly conditioning on both images, leveraging the semantic priors of generative models for semantic-aware transfer. Reward feedback learning using the tone style scorer is further incorporated to improve stylistic fidelity and visual quality. Experiments demonstrate the effectiveness of TST100K, and ICTone achieves state-of-the-art performance on both quantitative metrics and human evaluations.

Yuhai Deng, Huimin She, Wei Shen, Meng Li, Ruoxi Wu, Lunxi Yuan, Xiang Li • 2026

Related benchmarks

Task                | Dataset | Result            | Rank
Tone Style Transfer | PST50   | CP 0.7902         | 15
Style Transfer      | PST50   | CP Count 73.23    | 15
Style Transfer      | TST2K   | CP Count 72.85    | 15
Tone Style Transfer | TST2K   | PSNR 25.83        | 14
Tone Style Transfer | TST2K   | Average Ranking 1 | 7
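
For readers unfamiliar with the PSNR metric reported on TST2K, the following is a minimal sketch of the conventional definition, PSNR = 10 · log10(peak² / MSE), computed between a ground-truth image and a stylized output. It assumes 8-bit pixel values (peak of 255) passed as flat sequences; the function name `psnr` and the sample pixel values are illustrative only, and the exact evaluation protocol behind the listed result (color space, resolution, averaging) is not stated on this page.

```python
import math


def psnr(reference, output, peak=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values.

    Assumption: pixels are given as flat sequences on a [0, peak] scale.
    """
    if len(reference) != len(output):
        raise ValueError("inputs must have the same number of pixels")
    if not reference:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all pixel values.
    mse = sum((r - o) ** 2 for r, o in zip(reference, output)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Illustrative values only, not taken from any dataset.
ground_truth = [52, 120, 200, 33]
stylized = [50, 123, 198, 35]
score = psnr(ground_truth, stylized)
print(f"PSNR: {score:.2f} dB")
```

Higher values indicate that the output is closer to the ground truth in the pixel-wise sense, which is why a paired dataset with stylized ground truth is needed to report this metric at all.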