
Harmonic Networks: Deep Translation and Rotation Equivariance

About

Translating or rotating an input image should not affect the results of many computer vision tasks. Convolutional neural networks (CNNs) are already translation equivariant: input image translations produce proportionate feature map translations. This is not the case for rotations. Global rotation equivariance is typically sought through data augmentation, but patch-wise equivariance is more difficult. We present Harmonic Networks or H-Nets, a CNN exhibiting equivariance to patch-wise translation and 360-rotation. We achieve this by replacing regular CNN filters with circular harmonics, returning a maximal response and orientation for every receptive field patch. H-Nets use a rich, parameter-efficient and low computational complexity representation, and we show that deep feature maps within the network encode complicated rotational invariants. We demonstrate that our layers are general enough to be used in conjunction with the latest architectures and techniques, such as deep supervision and batch normalization. We also achieve state-of-the-art classification on rotated-MNIST, and competitive results on other benchmark challenges.

Daniel E. Worrall, Stephan J. Garbin, Daniyar Turmukhambetov, Gabriel J. Brostow • 2016

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Classification | MNIST rotated (test) | Test Error (%) | 1.69 | 105 |
| Classification | RotMNIST (test) | Classification Accuracy | 92.44 | 32 |
| Image Classification | MNIST original (test) | -- | -- | 20 |
| Tile-level classification | PCam (test) | AUC | 0.939 | 19 |
| Image Classification | SIM2MNIST (test) | Error Rate | 9.28 | 5 |
