DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation

About

Image restoration (IR) in real-world scenarios presents significant challenges due to the lack of high-capacity models and comprehensive datasets. To tackle these issues, we present a dual strategy: GenIR, an innovative data curation pipeline, and DreamClear, a cutting-edge Diffusion Transformer (DiT)-based image restoration model. GenIR, our pioneering contribution, is a dual-prompt learning pipeline that overcomes the limitations of existing datasets, which typically comprise only a few thousand images and thus offer limited generalizability for larger models. GenIR streamlines the process into three stages: image-text pair construction, dual-prompt based fine-tuning, and data generation & filtering. This approach circumvents the laborious data crawling process, ensuring copyright compliance and providing a cost-effective, privacy-safe solution for IR dataset construction. The result is a large-scale dataset of one million high-quality images. Our second contribution, DreamClear, is a DiT-based image restoration model. It utilizes the generative priors of text-to-image (T2I) diffusion models and the robust perceptual capabilities of multi-modal large language models (MLLMs) to achieve photorealistic restoration. To boost the model's adaptability to diverse real-world degradations, we introduce the Mixture of Adaptive Modulator (MoAM). It employs token-wise degradation priors to dynamically integrate various restoration experts, thereby expanding the range of degradations the model can address. Our exhaustive experiments confirm DreamClear's superior performance, underlining the efficacy of our dual strategy for real-world image restoration. Code and pre-trained models are available at: https://github.com/shallowdream204/DreamClear.
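The abstract does not say how MoAM combines its restoration experts, so the following is only a generic sketch of token-wise expert mixing, not the authors' implementation. It assumes a linear router that turns each token's degradation prior into softmax gates, and it treats experts as plain functions; the names `mix_experts_per_token`, `router`, and `experts` are hypothetical and do not come from the DreamClear code base.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_experts_per_token(tokens, priors, router, experts):
    """Blend expert outputs separately for every token.

    tokens  : list of N feature vectors (lists of floats)
    priors  : list of N degradation-prior vectors, one per token (assumed input)
    router  : list of K weight vectors, one per expert (hypothetical linear router)
    experts : list of K callables mapping a token vector to a vector of equal length
    """
    mixed, all_gates = [], []
    for token, prior in zip(tokens, priors):
        # One routing score per expert, computed from this token's prior only.
        scores = [sum(p * w for p, w in zip(prior, weights)) for weights in router]
        gates = softmax(scores)
        outputs = [expert(token) for expert in experts]
        # Weighted sum of the expert outputs, dimension by dimension.
        blended = [
            sum(g * out[i] for g, out in zip(gates, outputs))
            for i in range(len(token))
        ]
        mixed.append(blended)
        all_gates.append(gates)
    return mixed, all_gates


# Toy usage with two stand-in "experts" (identity and zeroing), purely illustrative.
tokens = [[2.0, 4.0], [1.0, 1.0]]
priors = [[1.0], [3.0]]
router = [[0.0], [0.0]]  # zero weights give uniform gates
experts = [lambda t: list(t), lambda t: [0.0] * len(t)]
mixed, gates = mix_experts_per_token(tokens, priors, router, experts)
```

Because the gates are computed per token, two regions of the same image with different degradation priors can lean on different experts, which is the general idea the abstract attributes to token-wise priors.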

Yuang Ai, Xiaoqiang Zhou, Huaibo Huang, Xiaotian Han, Zhengyu Chen, Quanzeng You, Hongxia Yang • 2024

Related benchmarks

Task                              | Dataset          | Result        | Rank
Object Detection                  | COCO 2017 (val)  | --            | 2454
Instance Segmentation             | COCO 2017 (val)  | APm 0.167     | 1144
Semantic Segmentation             | ADE20K           | mIoU 31.9     | 936
Super-Resolution                  | ImageNet (test)  | LPIPS 0.2463  | 32
Image Restoration                 | DRealSR (test)   | MUSIQ 59.83   | 27
Real-world Image Super-Resolution | RealLR200        | MUSIQ 65.926  | 26
Real-world Image Super-Resolution | RealLQ250        | MUSIQ 0.6669  | 26
Real-world Image Super-Resolution | DRealSR          | LPIPS 0.354   | 23
Real-world Image Super-Resolution | RealSR           | LPIPS 0.325   | 23
Image Restoration                 | DIV2K (val)      | PSNR 18.69    | 17
Showing 10 of 24 rows
