DyCoRM: Dynamic Criterion-Aware Reward Modeling for Text-to-Image Generation

About

With the continued advancement of text-to-image (T2I) generation, producing high-quality images is becoming increasingly attainable; consequently, user demands are shifting toward images that better satisfy their specific requirements. As reward models play an increasingly important role in assessing whether generated images align with user preference, this trend introduces an important challenge for reward modeling: rather than relying solely on static and general evaluation dimensions, reward models should account for the task-relevant and fine-grained criteria through which users assess whether generated images meet their specific requirements. To address this challenge, we propose DyCoRM, a dynamic, criterion-aware reward model that grounds task-relevant criteria and performs criterion-aware preference comparison. To support this setting, we construct DyCoDataset-20K, which provides dynamic criteria together with criterion-level annotations, and further derive DyCoBench-1K, a benchmark for systematically evaluating reward models under dynamic criteria. We further introduce DyCoPick, which applies criterion-aware reward modeling to selecting T2I images. Our contributions establish the first reward modeling framework for dynamic and fine-grained evaluation and practical application in T2I generation.
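
To make the idea of criterion-aware preference comparison concrete, here is a minimal sketch. The abstract does not say how criterion-level judgments are combined into an overall preference, so the majority vote below is purely an assumption for illustration and not the authors' method. The function name `compare_by_criteria` and the example criteria are hypothetical.

```python
from collections import Counter


def compare_by_criteria(judgments):
    """Combine per-criterion verdicts into one overall verdict.

    judgments maps a criterion description to "A", "B", or "tie".
    ASSUMPTION: a plain majority vote; the paper may aggregate differently.
    """
    counts = Counter(judgments.values())
    if counts["A"] > counts["B"]:
        return "A"
    if counts["B"] > counts["A"]:
        return "B"
    return "tie"


# Hypothetical task-specific criteria for a single prompt.
judgments = {
    "object count matches the prompt": "A",
    "rendered text is legible": "B",
    "lighting matches the requested mood": "A",
}
overall = compare_by_criteria(judgments)
```

Under this assumed rule, image A is preferred because it wins two of the three criteria. A weighted vote would be a natural variation if some criteria matter more to the user.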

Jiaying Qian, Ziheng Jia, Qian Zhang, Zicheng Zhang, Jiayi Guo, Junqi Zhang, Guangtao Zhai, Xiongkuo Min • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
-----|---------|--------|--------|-----
Human preference prediction | HPD v2 | Accuracy | 85.1 | 25
Pairwise Preference Prediction | DyCoBench-1K Single Criterion | P(A > B) | 70.2 | 17
Pairwise Preference Prediction | DyCoBench-1K Multiple Criteria | Preference Rate (A > B) | 65.6 | 17
Pairwise Preference Prediction | DyCoBench-1K Overall Preference | Preference Rate (A > B) | 78.3 | 17
Text-to-Image Preference Prediction | Pick-a-Pic | Accuracy | 73.4 | 17
Text-to-Image Preference Prediction | HPD v3 | Accuracy | 77.2 | 17
Text-to-Image Preference Prediction | Cross-domain Aggregate | Average Accuracy | 77 | 17
Text-to-Image Preference Prediction | ImageReward | Accuracy | 67.2 | 17
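
The page does not define how the Accuracy figures are computed. A common convention for pairwise preference benchmarks, assumed here, is the percentage of image pairs on which the model's preferred image matches the human label. The sketch below follows that assumption; `pairwise_accuracy` and the sample labels are hypothetical and are not taken from any of the listed datasets.

```python
def pairwise_accuracy(predicted, human):
    """Return the percentage of pairs where the model agrees with the human label.

    predicted and human are equal-length sequences of "A" or "B".
    ASSUMPTION: ties are not modelled; each pair has one preferred image.
    """
    if len(predicted) != len(human):
        raise ValueError("predicted and human must have the same length")
    if not human:
        raise ValueError("at least one pair is required")
    correct = sum(p == h for p, h in zip(predicted, human))
    return 100.0 * correct / len(human)


# Made-up labels for four image pairs.
predicted = ["A", "B", "A", "A"]
human = ["A", "B", "B", "A"]
score = pairwise_accuracy(predicted, human)  # 3 of 4 pairs agree
```

With these made-up labels the model agrees with the annotators on three of four pairs, giving 75.0.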