
OneNet: A Channel-Wise 1D Convolutional U-Net

About

Many state-of-the-art computer vision architectures leverage U-Net for its adaptability and efficient feature extraction. However, the multi-resolution convolutional design often leads to significant computational demands, limiting deployment on edge devices. We present a streamlined alternative: a 1D convolutional encoder that retains accuracy while enhancing its suitability for edge applications. Our novel encoder architecture achieves semantic segmentation through channel-wise 1D convolutions combined with pixel-unshuffle operations. By incorporating PixelShuffle, known for improving accuracy in super-resolution tasks while reducing computational load, OneNet captures spatial relationships without requiring 2D convolutions, reducing parameters by up to 47%. Additionally, we explore a fully 1D encoder-decoder that achieves a 71% reduction in size, albeit with some accuracy loss. We benchmark our approach against U-Net variants across diverse mask-generation tasks, demonstrating that it preserves accuracy effectively. Although focused on image segmentation, this architecture is adaptable to other convolutional applications. Code for the project is available at https://github.com/shbyun080/OneNet.
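The abstract relies on two rearrangement operations, pixel-unshuffle and PixelShuffle, which trade spatial resolution for channels and back. The sketch below is a minimal pure-Python illustration of those two operations on nested lists, following the usual (C, H, W) layout convention; it is not the authors' implementation (see the linked repository for that). The helper names `pixel_unshuffle`, `pixel_shuffle`, and `conv_weight_count` are hypothetical, and the weight-count helper shows only generic convolution arithmetic, not OneNet's actual layer configuration.

```python
def pixel_unshuffle(x, r):
    """Rearrange a (C, H*r, W*r) nested list into (C*r*r, H, W)."""
    c_in, h_in, w_in = len(x), len(x[0]), len(x[0][0])
    if h_in % r or w_in % r:
        raise ValueError("height and width must be divisible by r")
    h, w = h_in // r, w_in // r
    out = []
    for c in range(c_in):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                out.append([[x[c][y * r + i][z * r + j] for z in range(w)]
                            for y in range(h)])
    return out


def pixel_shuffle(x, r):
    """Inverse rearrangement: (C*r*r, H, W) back to (C, H*r, W*r)."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[None] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for y in range(h):
                    for z in range(w):
                        out[c][y * r + i][z * r + j] = src[y][z]
    return out


def conv_weight_count(c_in, c_out, kernel_dims):
    """Weights (bias excluded) of a dense convolution with the given kernel shape."""
    count = c_in * c_out
    for k in kernel_dims:
        count *= k
    return count


# A single-channel 4x4 "image" holding the values 0..15 in row-major order.
image = [[[row * 4 + col for col in range(4)] for row in range(4)]]
packed = pixel_unshuffle(image, 2)    # shape (4, 2, 2): space moved into channels
restored = pixel_shuffle(packed, 2)   # shape (1, 4, 4): original layout recovered

# Generic comparison (illustrative channel sizes): 3x3 2D kernel vs 3-tap 1D kernel.
weights_2d = conv_weight_count(64, 64, (3, 3))
weights_1d = conv_weight_count(64, 64, (3,))
```

After unshuffling, each spatial neighborhood sits along the channel axis, which is one way a 1D convolution can still see 2D context. For equal channel counts, a k-tap 1D kernel carries k weights per channel pair where a k x k 2D kernel carries k*k; the 47% and 71% figures in the abstract come from the paper's full architectures, not from this arithmetic.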

Sanghyun Byun, Kayvan Shah, Ayushi Gang, Christopher Apton, Jacob Song, Woo Seong Chung • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | VOC | mIoU: 16 | 44 |
| Semantic segmentation | Input tensor (1, 3, 256, 256) | Params (M): 9.08 | 9 |
| Semantic segmentation | MSD Heart | L_CE: 0.0041 | 6 |
| Semantic segmentation | MSD Brain | Log Loss (CE): 0.0363 | 6 |
| Semantic segmentation | MSD Lung | L_CE: 7.00e-4 | 6 |
| Semantic segmentation | Oxford Pet full mask PET_F | L_CE: 2.713 | 6 |
| Semantic segmentation | Oxford Pet small mask version - PET_S | CE Loss: 0.309 | 6 |
