
Spatially Conditioned Graphs for Detecting Human-Object Interactions

About

We address the problem of detecting human-object interactions in images using graphical neural networks. Unlike conventional methods, where nodes send scaled but otherwise identical messages to each of their neighbours, we propose to condition messages between pairs of nodes on their spatial relationships, resulting in different messages going to neighbours of the same node. To this end, we explore various ways of applying spatial conditioning under a multi-branch structure. Through extensive experimentation we demonstrate the advantages of spatial conditioning for the computation of the adjacency structure, messages and the refined graph features. In particular, we empirically show that as the quality of the bounding boxes increases, their coarse appearance features contribute relatively less to the disambiguation of interactions compared to the spatial information. Our method achieves an mAP of 31.33% on HICO-DET and 54.2% on V-COCO, significantly outperforming state-of-the-art on fine-tuned detections.
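The core idea in the abstract, that a node sends a different message to each neighbour depending on their spatial relationship, can be illustrated with a small NumPy sketch. This is not the authors' architecture: the function names, the four-dimensional box-pair encoding, the `tanh` projection, and the weight matrices `W_sp` and `W_msg` are all assumptions chosen for illustration.

```python
import numpy as np

def pairwise_spatial_features(boxes_h, boxes_o):
    """Hypothetical spatial encoding for every (human, object) box pair.

    boxes_h: (Nh, 4), boxes_o: (No, 4), each row [x1, y1, x2, y2].
    Returns an (Nh, No, 4) array: centre offset and log size ratio,
    both measured relative to the human box.
    """
    centre_h = (boxes_h[:, :2] + boxes_h[:, 2:]) / 2
    centre_o = (boxes_o[:, :2] + boxes_o[:, 2:]) / 2
    size_h = boxes_h[:, 2:] - boxes_h[:, :2]
    size_o = boxes_o[:, 2:] - boxes_o[:, :2]
    offset = (centre_o[None, :, :] - centre_h[:, None, :]) / size_h[:, None, :]
    ratio = np.log(size_o[None, :, :] / size_h[:, None, :])
    return np.concatenate([offset, ratio], axis=-1)

def spatially_conditioned_update(h_feats, o_feats, spatial, W_sp, W_msg):
    """One object-to-human message-passing step (illustrative only).

    h_feats: (Nh, D), o_feats: (No, D), spatial: (Nh, No, S),
    W_sp: (S, D), W_msg: (D, D).
    """
    d = h_feats.shape[1]
    # Project the pairwise spatial features into the node feature space.
    cond = np.tanh(spatial @ W_sp)                      # (Nh, No, D)
    # Modulate each object's message by the pair-specific spatial code,
    # so the same object sends a different message to each human.
    msgs = (o_feats @ W_msg)[None, :, :] * cond         # (Nh, No, D)
    # Adjacency: softmax over objects of a human/message compatibility score.
    logits = (h_feats[:, None, :] * msgs).sum(-1) / np.sqrt(d)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    adj = np.exp(logits)
    adj = adj / adj.sum(axis=1, keepdims=True)          # (Nh, No)
    # Refined human features: residual plus adjacency-weighted messages.
    new_h = h_feats + (adj[:, :, None] * msgs).sum(axis=1)
    return new_h, adj, msgs
```

In a conventional scheme the message from an object would be `o_feats @ W_msg` scaled only by the adjacency weight. Here the extra `cond` factor makes the message content itself depend on where the object sits relative to each human.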

Frederic Z. Zhang, Dylan Campbell, Stephen Gould • 2020

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Human-Object Interaction Detection | HICO-DET (test) | mAP (full) | 51.53 | 493 |
| Human-Object Interaction Detection | V-COCO (test) | AP (Role, Scenario 1) | 54.2 | 270 |
| Human-Object Interaction Detection | HICO-DET | mAP (Full) | 34.37 | 233 |
| Human-Object Interaction Detection | HICO-DET Known Object (test) | mAP (Full) | 51.75 | 112 |
| Human-Object Interaction Detection | V-COCO 1.0 (test) | AP_role (#1) | 54.2 | 76 |
| HOI Detection | HICO-DET (test) | Box mAP (Full) | 31.3 | 32 |
| Human-Object Interaction Detection | V-COCO | Box mAP (Scenario 1) | 54.2 | 32 |
| HOI Detection | HICO-DET v1.0 (test) | mAP (Default, Full) | 29.26 | 29 |
| HOI Detection | V-COCO v1 (test) | AP Role (Scenario 1) | 54.2 | 25 |
| HOI Segmentation | HICO-DET (test) | mask mAP (Full) | 31.3 | 12 |
Showing 10 of 12 rows

