iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection

About

Recent years have witnessed rapid progress in detecting and recognizing individual object instances. To understand the situation in a scene, however, computers need to recognize how humans interact with surrounding objects. In this paper, we tackle the challenging task of detecting human-object interactions (HOI). Our core idea is that the appearance of a person or an object instance contains informative cues on which relevant parts of an image to attend to for facilitating interaction prediction. To exploit these cues, we propose an instance-centric attention module that learns to dynamically highlight regions in an image conditioned on the appearance of each instance. Such an attention-based network allows us to selectively aggregate features relevant for recognizing HOIs. We validate the efficacy of the proposed network on the Verbs in COCO (V-COCO) and HICO-DET datasets and show that our approach compares favorably with the state of the art.
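The idea of attention conditioned on an instance's appearance can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain dot-product similarity followed by a softmax, and the names `instance_attention`, `instance_feat`, and `region_feats` are hypothetical. The real network operates on learned convolutional features rather than the toy vectors used here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def instance_attention(instance_feat, region_feats):
    """Return (attention weights, context feature) for one instance.

    instance_feat: appearance vector of a person or object (hypothetical).
    region_feats:  one feature vector per image region (hypothetical).
    Assumption: similarity is a plain dot product, normalized by softmax.
    """
    # Score each region by its similarity to the instance appearance.
    scores = [dot(instance_feat, r) for r in region_feats]
    weights = softmax(scores)
    # Aggregate region features, weighted by attention.
    dim = len(region_feats[0])
    context = [
        sum(w * r[d] for w, r in zip(weights, region_feats))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three image regions with 2-D features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
person = [2.0, 0.0]
weights, context = instance_attention(person, regions)
```

In this toy case the first and third regions score equally against the person vector, so they receive equal attention, and both outweigh the second region. The context vector is the attention-weighted average of the region features, which is the sense in which features are "selectively aggregated".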

Chen Gao, Yuliang Zou, Jia-Bin Huang • 2018

Related benchmarks

Task	Dataset	Metric	Result
Human-Object Interaction Detection	HICO-DET (test)	mAP (full)	33.38	544
Human-Object Interaction Detection	V-COCO (test)	AP (Role, Scenario 1)	45.3	270
Human-Object Interaction Detection	HICO-DET	mAP (Full)	14.8	263
Human-Object Interaction Detection	HICO-DET Known Object (test)	mAP (Full)	16.26	118
Human-Object Interaction Detection	V-COCO 1.0 (test)	AP_role (#1)	45.3	76
Human-Object Interaction Detection	V-COCO	AP^1 Role	45.3	65
Human-Object Interaction Detection	V-COCO	AP Role (Scenario 1)	45.3	53
HOI Detection	V-COCO	AP Role 1	45.3	40
Human-Object Interaction Detection	HICO-DET 1 (test)	Full mAP	16.26	33
Human-Object Interaction Detection	V-COCO Scenario 1 1.0	AP (Role)	45.3	32

Showing 10 of 22 rows
