HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

About

This study introduces HQ-Edit, a high-quality instruction-based image editing dataset with around 200,000 edits. Unlike prior approaches that rely on attribute guidance or human feedback to build datasets, we devise a scalable data collection pipeline leveraging advanced foundation models, namely GPT-4V and DALL-E 3. To ensure its high quality, diverse examples are first collected online, expanded, and then used to create high-quality diptychs featuring input and output images with detailed text prompts, followed by precise alignment ensured through post-processing. In addition, we propose two evaluation metrics, Alignment and Coherence, to quantitatively assess the quality of image edit pairs using GPT-4V. HQ-Edit's high-resolution images, rich in detail and accompanied by comprehensive editing prompts, substantially enhance the capabilities of existing image editing models. For example, an HQ-Edit finetuned InstructPix2Pix can attain state-of-the-art image editing performance, even surpassing those models fine-tuned with human-annotated data. The project page is https://thefllood.github.io/HQEdit_web.
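
The abstract mentions diptychs that hold the input and output image side by side. As a rough illustration only, the sketch below splits such a diptych into its two halves and computes a pixel-wise mean squared error (MSE) between them, MSE being one of the metrics listed under the related benchmarks. The nested-list image representation and the function names `split_diptych` and `mean_squared_error` are assumptions made for this example; the sketch is not the authors' post-processing or evaluation code.

```python
from typing import List, Tuple

# Hypothetical representation: an image is a list of rows of grayscale values.
Image = List[List[float]]


def split_diptych(diptych: Image) -> Tuple[Image, Image]:
    """Split a side-by-side diptych into left (input) and right (output) halves."""
    if not diptych or not diptych[0]:
        raise ValueError("diptych must be non-empty")
    width = len(diptych[0])
    if any(len(row) != width for row in diptych):
        raise ValueError("all rows must have the same width")
    if width % 2 != 0:
        raise ValueError("diptych width must be even")
    half = width // 2
    left = [row[:half] for row in diptych]
    right = [row[half:] for row in diptych]
    return left, right


def mean_squared_error(a: Image, b: Image) -> float:
    """Pixel-wise MSE between two images of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")
    diffs = [(pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# Toy 2x4 diptych: the left 2x2 block is the input, the right 2x2 block the output.
diptych = [
    [0.0, 10.0, 0.0, 14.0],
    [20.0, 30.0, 22.0, 30.0],
]
source, edited = split_diptych(diptych)
print(mean_squared_error(source, edited))  # prints 5.0
```

A real pipeline would more likely operate on image arrays with an imaging library; plain lists are used here only to keep the sketch self-contained.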

Mude Hui, Siwei Yang, Bingchen Zhao, Yichun Shi, Heng Wang, Peng Wang, Yuyin Zhou, Cihang Xie • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
Instructive image editing | EMU Edit (test) | CLIP Image Similarity | 0.7095 | 55
Multi-turn image editing | MSE-Bench | Success Rate (Turn 1) | 47.7 | 26
Object Retexture | UHRSD (test) | MSE | 8.03e+3 | 14
Image Editing | ECSSD (test) | MSE | 7.73e+3 | 13
Image Editing Quality Evaluation | Various Image Editing Datasets | Instruction Adherence Score | 2.9 | 12
Multi-object editing | CompBench | LC-T | 19.163 | 11
