
MatteFormer: Transformer-Based Image Matting via Prior-Tokens

About

In this paper, we propose a transformer-based image matting model called MatteFormer, which takes full advantage of trimap information in the transformer block. Our method first introduces a prior-token, which is a global representation of each trimap region (e.g., foreground, background, and unknown). These prior-tokens are used as global priors and participate in the self-attention mechanism of each block. Each stage of the encoder is composed of PAST (Prior-Attentive Swin Transformer) blocks, which are based on the Swin Transformer block but differ in two aspects: 1) a PA-WSA (Prior-Attentive Window Self-Attention) layer, which performs self-attention not only with spatial-tokens but also with prior-tokens; 2) a prior-memory, which accumulates prior-tokens from previous blocks and transfers them to the next block. We evaluate MatteFormer on the commonly used image matting datasets Composition-1k and Distinctions-646. Experimental results show that our proposed method achieves state-of-the-art performance by a large margin. Our code is available at https://github.com/webtoon/matteformer.
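To make the PA-WSA idea concrete, the following is a minimal NumPy sketch of attention over spatial-tokens plus prior-tokens. It is not the authors' implementation (which uses learned query/key/value projections, multi-head attention, relative position bias, and shifted windows); here each prior-token is simply the mean of the spatial-tokens in one trimap region, and the function names and shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prior_attentive_self_attention(spatial, trimap):
    """Conceptual PA-WSA sketch: spatial-tokens attend to spatial + prior tokens.

    spatial: (N, C) tokens of one window.
    trimap:  (N,) region labels, 0 = background, 1 = unknown, 2 = foreground.
    """
    N, C = spatial.shape
    # One prior-token per trimap region: the mean token of that region
    # (a stand-in for the paper's learned global representation).
    priors = []
    for region in (0, 1, 2):
        mask = trimap == region
        priors.append(spatial[mask].mean(axis=0) if mask.any() else np.zeros(C))
    priors = np.stack(priors)                      # (3, C)
    # Keys/values are the concatenation of spatial-tokens and prior-tokens,
    # so each spatial-token can also attend to the global region priors.
    kv = np.concatenate([spatial, priors], axis=0)  # (N + 3, C)
    attn = softmax(spatial @ kv.T / np.sqrt(C))     # (N, N + 3)
    return attn @ kv                                # (N, C)
```

The output keeps the spatial-token shape, so the layer drops into a Swin-style block in place of plain window self-attention; the extra three columns of the attention matrix are what lets every token consult the foreground/background/unknown priors.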

GyuTae Park, SungJoon Son, JaeYoung Yoo, SeHo Kim, Nojun Kwak • 2022

Related benchmarks

Task | Dataset | Result | Rank
Image Matting | Composition-1K (test) | SAD 23.8 | 203
Matting | Distinction-646 (test) | SAD 23.6 | 45
Natural Image Matting | Distinctions-646 (test) | SAD 23.9 | 21
Portrait Matting | PPM-100 (test) | MSE 0.0092 | 19
Semantic Image Matting | Semantic Image Matting Dataset (test) | SAD 29.66 | 16
Image Matting | AIM-500 | SAD 26.87 | 14
Image Matting | Adobe Composition-1K | SAD 23.8 | 12
Image Matting | Distinctions-646 | SAD 23.6 | 10
Image Matting | Semantic Image Matting | SAD 23.9 | 8
Interactive Matting | HIM-100K (test) | MSE 0.0039 | 8

(Showing 10 of 13 rows)
