KernelWarehouse: Rethinking the Design of Dynamic Convolution

About

Dynamic convolution learns a linear mixture of n static kernels weighted with their input-dependent attentions, demonstrating superior performance than normal convolution. However, it increases the number of convolutional parameters by n times, and thus is not parameter efficient. This leads to no research progress that can allow researchers to explore the setting n>100 (an order of magnitude larger than the typical setting n<10) for pushing forward the performance boundary of dynamic convolution while enjoying parameter efficiency. To fill this gap, in this paper, we propose KernelWarehouse, a more general form of dynamic convolution, which redefines the basic concepts of ``kernels", ``assembling kernels" and ``attention function" through the lens of exploiting convolutional parameter dependencies within the same layer and across neighboring layers of a ConvNet. We testify the effectiveness of KernelWarehouse on ImageNet and MS-COCO datasets using various ConvNet architectures. Intriguingly, KernelWarehouse is also applicable to Vision Transformers, and it can even reduce the model size of a backbone while improving the model accuracy. For instance, KernelWarehouse (n=4) achieves 5.61%|3.90%|4.38% absolute top-1 accuracy gain on the ResNet18|MobileNetV2|DeiT-Tiny backbone, and KernelWarehouse (n=1/4) with 65.10% model size reduction still achieves 2.29% gain on the ResNet18 backbone. The code and models are available at https://github.com/OSVAI/KernelWarehouse.

Chao Li, Anbang Yao• 2024

Related benchmarks

Task	Dataset	Result
Semantic segmentation	ADE20K (val)	--	3069
Object Detection	COCO 2017 (val)	--	2843
Instance Segmentation	COCO 2017 (val)	--	1275
Instance Segmentation	COCO	AP Mask40.6	15
Object Detection	COCO	Box AP44.8	9

Showing 5 of 5 rows

Other info

Follow for update

@wizwand_team Discord