Neural-Logic Human-Object Interaction Detection

About

The interaction decoder used in prevalent Transformer-based HOI detectors typically accepts pre-composed human-object pairs as inputs. Though achieving remarkable performance, such a paradigm lacks feasibility and cannot explore novel combinations over entities during decoding. We present LogicHOI, a new HOI detector that leverages neural-logic reasoning and Transformer to infer feasible interactions between entities. Specifically, we modify the self-attention mechanism in the vanilla Transformer, enabling it to reason over the <human, action, object> triplet and constitute novel interactions. Meanwhile, this reasoning process is guided by two crucial properties for understanding HOI: affordances (the potential actions an object can facilitate) and proxemics (the spatial relations between humans and objects). We formulate these two properties in first-order logic and ground them into continuous space to constrain the learning process of our approach, leading to improved performance and zero-shot generalization capabilities. We evaluate LogicHOI on V-COCO and HICO-DET under both normal and zero-shot setups, achieving significant improvements over existing methods.
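To make "grounding first-order logic into continuous space" concrete, here is a minimal sketch of how an affordance rule could be relaxed into a differentiable-style penalty. It is an illustration only, not the authors' implementation: the abstract does not say which fuzzy operators LogicHOI uses, so the Reichenbach implication and probabilistic-sum disjunction below are assumptions, and all function names and the example probabilities are hypothetical.

```python
from math import prod


def fuzzy_or(values):
    """Probabilistic-sum disjunction (assumed operator): 1 - prod(1 - v)."""
    return 1.0 - prod(1.0 - v for v in values)


def fuzzy_implies(premise, conclusion):
    """Reichenbach implication (assumed operator): 1 - p + p * c."""
    return 1.0 - premise + premise * conclusion


def affordance_satisfaction(action_prob, object_probs, affording_objects):
    """Truth degree of the rule: action(a) -> OR over objects that afford a.

    action_prob: predicted probability of the action.
    object_probs: dict mapping object class to predicted probability.
    affording_objects: classes assumed to afford the action (hypothetical).
    """
    conclusion = fuzzy_or(object_probs[o] for o in affording_objects)
    return fuzzy_implies(action_prob, conclusion)


def affordance_loss(action_prob, object_probs, affording_objects):
    """Penalty that shrinks to 0 as the rule becomes fully satisfied."""
    return 1.0 - affordance_satisfaction(
        action_prob, object_probs, affording_objects
    )


# Hypothetical predictions for a single human-object pair.
object_probs = {"bicycle": 0.8, "horse": 0.1, "apple": 0.05}
consistent = affordance_loss(0.9, object_probs, ["bicycle", "horse"])
inconsistent = affordance_loss(0.9, object_probs, ["apple"])
print(consistent < inconsistent)
```

In this sketch, a confident "ride" prediction paired with a likely bicycle incurs a small penalty, while the same action paired only with an unlikely affording object incurs a large one. A proxemics rule could be relaxed the same way, with spatial predicates in place of object classes.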

Liulei Li, Jianan Wei, Wenguan Wang, Yi Yang • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
Human-Object Interaction Detection | HICO-DET (test) | mAP (full) | 35.47 | 544
Human-Object Interaction Detection | V-COCO (test) | AP (Role, Scenario 1) | 64.4 | 270
Human-Object Interaction Detection | HICO-DET | mAP (Full) | 35.47 | 252
Human-Object Interaction Detection | HICO-DET (Rare First Unseen Combination (RF-UC)) | mAP (Full) | 33.17 | 77
Human-Object Interaction Detection | HICO-DET (NF-UC) | mAP (Full) | 27.95 | 56
Human-Object Interaction Detection | HICO-DET Non-rare First Unseen Composition (NF-UC) | AP (Unseen) | 26.84 | 49
Human-Object Interaction Detection | HICO-DET (UO) | mAP (Full) | 33.17 | 47
Human-Object Interaction Detection | V-COCO | AP (Role) | 65.6 | 23
Human-Object Interaction Detection | HICO-DET computed-box setting | mAP (Full) | 2.16 | 23
Human-Object Interaction Detection | HICO-DET closed setting | Performance Score (Rare) | 32.03 | 18