
Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation

About

Many applications, such as autonomous driving, heavily rely on multi-modal data where spatial alignment between the modalities is required. Most multi-modal registration methods struggle to compute the spatial correspondence between the images using prevalent cross-modality similarity measures. In this work, we bypass the difficulty of developing cross-modality similarity measures by training an image-to-image translation network on the two input modalities. This learned translation allows training the registration network using simple and reliable mono-modality metrics. We perform multi-modal registration using two networks: a spatial transformation network and a translation network. We show that by encouraging our translation network to be geometry preserving, we manage to train an accurate spatial transformation network. Compared to state-of-the-art multi-modal methods, our method is unsupervised, requires no pairs of aligned modalities for training, and can be adapted to any pair of modalities. We evaluate our method quantitatively and qualitatively on commercial datasets, showing that it performs well on several modalities and achieves accurate alignment.
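The core idea in the abstract can be illustrated with a toy sketch: a translation map T (modality A to modality B) and a spatial transform Φ, where T is geometry preserving (it commutes with Φ), so the registration objective reduces to a simple mono-modality L1 metric. The pointwise intensity map and the integer-pixel shift below are illustrative stand-ins, not the paper's networks; all function names here are hypothetical.

```python
import numpy as np

def translate(img, gain=2.0, bias=0.1):
    # Toy stand-in for the translation network T: a pointwise
    # intensity mapping (modality A -> pseudo-modality B).
    # Pointwise maps move no pixels, so they preserve geometry.
    return gain * img + bias

def spatial_transform(img, dy, dx):
    # Toy stand-in for the spatial transformation network Phi:
    # a rigid integer-pixel shift.
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

def mono_modality_loss(pred, target):
    # Simple L1 metric, usable once both images live in one modality.
    return np.abs(pred - target).mean()

def commutativity_gap(img, dy, dx):
    # Geometry-preservation check: T and Phi should commute,
    # T(Phi(x)) == Phi(T(x)), when T moves no pixels around.
    a = translate(spatial_transform(img, dy, dx))
    b = spatial_transform(translate(img), dy, dx)
    return np.abs(a - b).max()

rng = np.random.default_rng(0)
moving = rng.random((16, 16))                         # modality A image
fixed = translate(spatial_transform(moving, 2, -3))   # aligned modality B image

# Registration objective: warp the moving image, translate it to
# modality B, then compare with the fixed image via plain L1.
print(mono_modality_loss(translate(spatial_transform(moving, 2, -3)), fixed))
print(commutativity_gap(moving, 2, -3))  # 0.0: pointwise T commutes with Phi
```

For the correct warp both printed values are exactly 0.0, showing why a geometry-preserving translation lets the mono-modality loss drive the spatial transform alone.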

Moab Arar, Yiftach Ginger, Dov Danon, Ilya Leizerson, Amit Bermano, Daniel Cohen-Or • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Geometric Correspondence | KAIST | RGB-T | 2.1 | 10 |
| Segmentation | Chest MRI to CT | Accuracy | 90.4 | 10 |
| Segmentation | Retinal OCT | Accuracy | 84 | 10 |
| Segmentation | Cardiac MRI | Accuracy | 94.8 | 10 |
| Geometric Correspondence | NYU Depth V2 | RGB-to-D Correspondence Score | 1.3 | 10 |
| Geometric Correspondence | Thermal-IM | Cross-modal Score (RGB->T) | 2.3 | 10 |
| Image Synthesis | Retinal OCT (test) | FID | 138.4 | 9 |
| Image Synthesis | Cardiac MRI (test) | FID | 235.5 | 9 |
| Image Synthesis | Chest MRI to CT (test) | FID | 124.5 | 9 |
