Emu Edit: Precise Image Editing via Recognition and Generation Tasks

About

Instruction-based image editing holds immense potential for a variety of applications, as it enables users to perform any editing operation using a natural language instruction. However, current models in this domain often struggle with accurately executing user instructions. We present Emu Edit, a multi-task image editing model which sets state-of-the-art results in instruction-based image editing. To develop Emu Edit we train it to multi-task across an unprecedented range of tasks, such as region-based editing, free-form editing, and Computer Vision tasks, all of which are formulated as generative tasks. Additionally, to enhance Emu Edit's multi-task learning abilities, we provide it with learned task embeddings which guide the generation process towards the correct edit type. Both these elements are essential for Emu Edit's outstanding performance. Furthermore, we show that Emu Edit can generalize to new tasks, such as image inpainting, super-resolution, and compositions of editing tasks, with just a few labeled examples. This capability offers a significant advantage in scenarios where high-quality samples are scarce. Lastly, to facilitate a more rigorous and informed assessment of instructable image editing models, we release a new challenging and versatile benchmark that includes seven different image editing tasks.

Shelly Sheynin, Adam Polyak, Uriel Singer, Yuval Kirstain, Amit Zohar, Oron Ashual, Devi Parikh, Yaniv Taigman • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Instructive image editing | EMU Edit (test) | CLIP Image Similarity 0.859 | 83 |
| Instructive image editing | MagicBrush (test) | CLIP Image 0.897 | 53 |
| Controllable Image Generation and Editing | CelebA-HQ (test) | Accuracy 71 | 20 |
| Human Image Controllability and Editing | AffectHuman-43K (test) | Accuracy 72.4 | 20 |
| Facial Image Editing | AffectNet | Accuracy 66.8 | 20 |
| Instructive image editing | EMU Edit 1.0 (test) | CLIPim 0.859 | 15 |
| Instruction-based Image Editing | EmuEdit-bench (test) | CLIP-src Score 0.8854 | 13 |
| Image Editing | Emu Edit | DINO Similarity 0.819 | 12 |
| Instructive image editing | MagicBrush 1.0 (test) | CLIP Image Similarity 0.897 | 12 |
| Image Editing | MagicBrush v1 (test) | CLIP Input Similarity 0.897 | 7 |
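
Several of the results listed above (CLIP Image Similarity, DINO Similarity) are embedding similarity scores. As a rough illustration only, and assuming such a score is the cosine similarity between the embedding of the source image and the embedding of the edited image, averaged over a test set, the computation might look like the sketch below. The function names and toy vectors are hypothetical; real embeddings would come from a CLIP or DINO image encoder, which is not shown here, and the exact evaluation protocol behind each listed number may differ.

```python
import math
from typing import Iterable, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def mean_similarity(
    pairs: Iterable[Tuple[Sequence[float], Sequence[float]]]
) -> float:
    """Average cosine similarity over (source, edited) embedding pairs."""
    scores = [cosine_similarity(src, out) for src, out in pairs]
    if not scores:
        raise ValueError("at least one pair is required")
    return sum(scores) / len(scores)


# Toy 3-dimensional vectors standing in for real image embeddings
# (assumption: a real pipeline would produce these with an image encoder).
toy_pairs = [
    ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]),  # same direction
    ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),  # orthogonal directions
]
score = mean_similarity(toy_pairs)
```

A score near 1 means the edited image's embedding points in almost the same direction as the source image's, which is why metrics of this kind are commonly read as a measure of how much of the original image an edit preserves.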