MagicQuill: An Intelligent Interactive Image Editing System
About
Image editing involves a variety of complex tasks and requires efficient and precise manipulation techniques. In this paper, we present MagicQuill, an integrated image editing system that enables swift actualization of creative ideas. Our system features a streamlined yet functionally robust interface, allowing for the articulation of editing operations (e.g., inserting elements, erasing objects, altering color) with minimal input. These interactions are monitored by a multimodal large language model (MLLM) to anticipate editing intentions in real time, bypassing the need for explicit prompt entry. Finally, we apply a powerful diffusion prior, enhanced by a carefully learned two-branch plug-in module, to process editing requests with precise control. Experimental results demonstrate the effectiveness of MagicQuill in achieving high-quality image edits. Please visit https://magic-quill.github.io to try out our system.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Editing | MagicBrush Single-Turn | L1 Loss | 0.033 | 11 |
| Multimodal Oil Painting Generation | DiffusionDB stylized | Gram Similarity | 0.585 | 8 |
| Image Editing | MagicBrush Multi-Turn | L1 Loss | 0.035 | 7 |
| Editing Intent Prediction | Scribbles | Intent Prediction Accuracy | 30.2 | 6 |
| Conditional Image Editing | Constructed dataset (test) | LPIPS | 0.0667 | 5 |
| Image Editing | DiffusionDB User Study (test) | Semantic Alignment Score | 4.01 | 5 |
| 3D Texture Editing | 24 3D meshes (test) | CLIP Score | 28.04 | 5 |
| Intent Prediction | Painting Assistor Evaluation Set (test) | GPT-4 Similarity | 2.712 | 4 |
| Line-guided Region Redrawing | Line-guided Region Redrawing (test) | PSNR | 17.78 | 4 |
| Line-guided Region Redrawing | Line-guided Region Redrawing dataset | LPIPS | 0.1472 | 4 |
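
For readers unfamiliar with the pixel-level metrics in the table, the following is a minimal sketch of how L1 loss and PSNR are conventionally computed between an edited image and a reference. It assumes pixel values normalized to [0, 1] and flattened into plain lists. The function names and sample values are illustrative only, and the exact evaluation protocol behind the reported numbers is not stated here.

```python
import math


def l1_loss(a, b):
    """Mean absolute difference between two equally sized pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def psnr(a, b, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel "images", assumed to be normalized to [0, 1].
edited = [0.10, 0.50, 0.90, 0.30]
target = [0.20, 0.50, 0.80, 0.30]

print(f"L1 loss: {l1_loss(edited, target):.4f}")
print(f"PSNR: {psnr(edited, target):.2f} dB")
```

Lower is better for L1 loss, while higher is better for PSNR. LPIPS, by contrast, relies on a learned perceptual network and cannot be reduced to a closed-form expression like these two.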