
Pre-Trained Image Processing Transformer

About

As the computing power of modern hardware increases rapidly, pre-trained deep learning models (e.g., BERT, GPT-3) learned on large-scale datasets have shown their effectiveness over conventional methods. This progress is mainly attributed to the representation ability of the transformer and its variant architectures. In this paper, we study low-level computer vision tasks (e.g., denoising, super-resolution, and deraining) and develop a new pre-trained model, namely, the image processing transformer (IPT). To maximally exploit the capability of the transformer, we propose to utilize the well-known ImageNet benchmark to generate a large number of corrupted image pairs. The IPT model is trained on these images with multiple heads and multiple tails. In addition, contrastive learning is introduced to adapt well to different image processing tasks. The pre-trained model can therefore be efficiently employed on a desired task after fine-tuning. With only one pre-trained model, IPT outperforms the current state-of-the-art methods on various low-level benchmarks. Code is available at https://github.com/huawei-noah/Pretrained-IPT and https://gitee.com/mindspore/mindspore/tree/master/model_zoo/research/cv/IPT
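The pre-training data described above (corrupted image pairs synthesized from clean ImageNet images, one degradation per task) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: `make_corrupted_pairs` and all degradation choices here (average pooling as a stand-in for bicubic downsampling, additive Gaussian noise for denoising, a thresholded random mask as crude synthetic rain) are assumptions made for clarity.

```python
import numpy as np

def make_corrupted_pairs(clean, noise_sigma=25.0, sr_scale=2, seed=0):
    """Build (degraded, clean) pairs for several tasks from one clean image.

    Hypothetical helper, not the paper's code; the degradations are
    simple stand-ins for the synthetic corruptions IPT applies to
    ImageNet images during pre-training.
    """
    rng = np.random.default_rng(seed)
    pairs = {}

    # Denoising: additive Gaussian noise on the clean image.
    noisy = clean + rng.normal(0.0, noise_sigma, clean.shape)
    pairs["denoise"] = (np.clip(noisy, 0, 255), clean)

    # Super-resolution: average-pool downsampling (stand-in for bicubic).
    h = clean.shape[0] - clean.shape[0] % sr_scale
    w = clean.shape[1] - clean.shape[1] % sr_scale
    lr = clean[:h, :w].reshape(
        h // sr_scale, sr_scale, w // sr_scale, sr_scale, -1
    ).mean(axis=(1, 3))
    pairs[f"sr_x{sr_scale}"] = (lr, clean)

    # Deraining: sparse bright streaks from a thresholded random mask.
    rain = (rng.random(clean.shape[:2]) > 0.98)[..., None] * 80.0
    pairs["derain"] = (np.clip(clean + rain, 0, 255), clean)

    return pairs
```

Each entry pairs a task-specific degraded input with the shared clean target, matching the multi-head/multi-tail setup in which each task has its own head and tail around one shared transformer body.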

Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao • 2020

Related benchmarks

Task                            Dataset         Metric  Result  Rank
Super-Resolution                Set5            PSNR    38.37   751
Super-Resolution                Urban100        PSNR    33.76   603
Super-Resolution                Set14           PSNR    34.43   586
Image Deblurring                GoPro (test)    PSNR    32.91   585
Image Super-resolution          Set5 (test)     PSNR    38.37   544
Image Super-resolution          Set5            PSNR    38.37   507
Single Image Super-Resolution   Urban100        PSNR    33.76   500
Super-Resolution                B100            PSNR    32.48   418
Super-Resolution                B100 (test)     PSNR    32.48   363
Image Super-resolution          Set14           PSNR    34.43   329

Showing 10 of 97 rows.
