
Event Camera Data Dense Pre-training

About

This paper introduces a self-supervised learning framework for pre-training neural networks on dense prediction tasks with event camera data. Our approach uses event data alone for training. Directly transferring dense RGB pre-training methods to event camera data yields subpar performance. This is attributed to the spatial sparsity inherent in an event image (converted from event data), where many pixels carry no information. To mitigate this sparsity issue, we encode an event image into event patch features, automatically mine contextual similarity relationships among patches, group the patch features into distinctive contexts, and enforce context-to-context similarities to learn discriminative event features. To train our framework, we curate a synthetic event camera dataset featuring diverse scene and motion patterns. Transfer learning performance on downstream dense prediction tasks demonstrates the superiority of our method over state-of-the-art approaches.

Yan Yang, Liyuan Pan, Liu Liu• 2023
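The pipeline in the abstract (patchify an event image, group patch features into contexts, compare contexts) can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: the feature encoder is replaced by raw pixel patches, and the context mining step is assumed to be a simple k-means grouping; the function names (`patchify`, `group_into_contexts`, `context_similarity`) are hypothetical.

```python
import numpy as np

def patchify(event_img, patch=8):
    """Split an (H, W) event image into flattened non-overlapping patches.

    Stands in for the paper's patch feature encoder.
    """
    H, W = event_img.shape
    return (event_img.reshape(H // patch, patch, W // patch, patch)
                     .transpose(0, 2, 1, 3)
                     .reshape(-1, patch * patch))

def group_into_contexts(feats, n_ctx=4, iters=10, seed=0):
    """Group patch features into contexts via naive k-means.

    Assumed stand-in for the paper's automatic context mining.
    Returns per-patch assignments and the context (cluster) features.
    """
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), n_ctx, replace=False)].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance of every patch to every context center.
        d = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for k in range(n_ctx):
            if (assign == k).any():
                centers[k] = feats[assign == k].mean(0)
    return assign, centers

def context_similarity(ctx_a, ctx_b):
    """Cosine similarity matrix between two sets of context features.

    A context-to-context loss would be built on top of such a matrix.
    """
    a = ctx_a / (np.linalg.norm(ctx_a, axis=1, keepdims=True) + 1e-8)
    b = ctx_b / (np.linalg.norm(ctx_b, axis=1, keepdims=True) + 1e-8)
    return a @ b.T

if __name__ == "__main__":
    # Synthetic sparse "event image": ~10% of pixels carry events.
    rng = np.random.default_rng(0)
    img = rng.random((32, 32)) * (rng.random((32, 32)) < 0.1)
    feats = patchify(img)                       # (16, 64) patch features
    assign, contexts = group_into_contexts(feats)
    sim = context_similarity(contexts, contexts)  # (4, 4) similarity matrix
    print(assign.shape, contexts.shape, sim.shape)
```

In the actual framework the patch features come from a learned encoder and the context-to-context similarities supervise it; the sketch only shows how sparsity motivates pooling patches into contexts before comparing them.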

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Optical Flow | MVSEC 1.0 (indoor_flying1) | EPE | 0.36 | 52 |
| Semantic Segmentation | DDD17 | mIoU | 62.56 | 50 |
| Optical Flow | MVSEC 1.0 (indoor_flying3) | EPE | 0.42 | 46 |
| Optical Flow | MVSEC 1.0 (indoor_flying2) | EPE | 0.45 | 46 |
| Semantic Segmentation | DDD17 (test) | mIoU | 55.73 | 46 |
| Semantic Segmentation | DSEC (test) | mIoU | 56.38 | 34 |
| Semantic Segmentation | DSEC-Semantic | mIoU | 61.25 | 20 |
| Monocular Depth Estimation | DSEC-Depth | RMSE | 9.477 | 20 |
| Monocular Depth Estimation | MVSEC Depth | RMSE | 6.957 | 20 |
| Object Recognition | N-ImageNet 1.0 (test) | Top-1 Accuracy | 51.4 | 13 |
(10 of 12 benchmark results shown.)
