Explaining Object Detectors via Collective Contribution of Pixels

About

Visual explanations for object detectors are crucial for enhancing their reliability. Object detectors identify and localize instances by assessing multiple visual features collectively. When generating explanations, overlooking these collective influences in detections may lead to missing compositional cues or capturing spurious correlations. However, existing methods typically focus solely on individual pixel contributions, neglecting the collective contribution of multiple pixels. To address this limitation, we propose a game-theoretic method based on Shapley values and interactions to explicitly capture both individual and collective pixel contributions. Our method provides explanations for both bounding box localization and class determination, highlighting regions crucial for detection. Extensive experiments demonstrate that the proposed method identifies important regions more accurately than state-of-the-art methods. The code is available at https://github.com/tttt-0814/VX-CODE
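The distinction the abstract draws between individual and collective pixel contributions can be illustrated with a small worked example. The sketch below is not the authors' implementation. It computes exact Shapley values and the standard pairwise Shapley interaction index for a made-up score function over four image patches. The function `toy_score` is a hypothetical stand-in for a detector's output, in which patches 1 and 2 only raise the score when both are present.

```python
from itertools import combinations
from math import factorial


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def shapley_value(v, players, i):
    """Exact Shapley value of player i under the set function v."""
    n = len(players)
    others = [p for p in players if p != i]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
        total += weight * (v(s | {i}) - v(s))
    return total


def shapley_interaction(v, players, i, j):
    """Exact pairwise Shapley interaction index of players i and j."""
    n = len(players)
    others = [p for p in players if p not in (i, j)]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(n - len(s) - 2) / factorial(n - 1)
        # Joint effect of adding i and j minus their separate effects.
        delta = v(s | {i, j}) - v(s | {i}) - v(s | {j}) + v(s)
        total += weight * delta
    return total


def toy_score(patches):
    """Hypothetical detector score for a set of visible patches (assumption)."""
    score = 0.0
    if 0 in patches:
        score += 0.2  # patch 0 contributes on its own
    if 1 in patches and 2 in patches:
        score += 0.6  # patches 1 and 2 matter only together
    return score  # patch 3 never contributes


players = [0, 1, 2, 3]
values = {i: shapley_value(toy_score, players, i) for i in players}
pair_12 = shapley_interaction(toy_score, players, 1, 2)
pair_01 = shapley_interaction(toy_score, players, 0, 1)

print(values)   # per-patch Shapley values
print(pair_12)  # interaction between patches 1 and 2
print(pair_01)  # interaction between patches 0 and 1
```

In this toy game the Shapley values split the joint 0.6 evenly between patches 1 and 2, while the interaction index attributes the full 0.6 to the pair and zero to the pair (0, 1). Exact enumeration is exponential in the number of patches, so a practical method would need some approximation or search strategy; how the paper handles this is not described in the abstract.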

Toshinori Yamauchi, Hiroshi Kera, Kazuhiko Kawamoto • 2024

Related benchmarks

Task	Dataset	Result
Visual Explanation	MS-COCO	--	30
Object Detection Explanation Faithfulness	MS-COCO	Insertion 92.6	25
Faithfulness of identified regions	Pascal VOC	Insertion 85	18
Object Detection	MS-COCO	Insertion Score 92.2	11
Visual Explanation Faithfulness	MS-COCO Misclassification failure cases (test)	Insertion (Ins) 73.8	9
Visual Explanation Faithfulness	MS-COCO Mislocalization failure cases (test)	Insertion Score 78.7	9
Energy-based Pointing Game	MS-COCO	EPG (B) 64.4	8
Pointing game	MS-COCO	PG (B) 96.5	8
Interaction Score Analysis	COCO (300 instances)	Interaction Score 5.1	7
Object Detection Explanation Faithfulness	COCO 100 detected instances	Insertion Score (Ins) 93.3	7
