Category Query Learning for Human-Object Interaction Classification
About
Unlike most previous HOI methods, which focus on learning better human-object features, we propose a novel and complementary approach called category query learning. Such queries are explicitly associated with interaction categories, converted to image-specific category representations via a transformer decoder, and learnt via an auxiliary image-level classification task. This idea is motivated by an earlier multi-label image classification method, but is applied here for the first time to the challenging human-object interaction classification task. Our method is simple, general and effective. It is validated on three representative HOI baselines and achieves new state-of-the-art results on two benchmarks.
Chi Xie, Fangao Zeng, Yue Hu, Shuang Liang, Yichen Wei • 2023
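The mechanism summarised in the abstract (one query per interaction category, turned into an image-specific representation and supervised by an image-level classification loss) can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single, projection-free cross-attention step in place of a full transformer decoder, random toy features, and a binary cross-entropy loss as the auxiliary objective. All names (`cross_attend`, `image_level_logits`, `bce_with_logits`) and sizes are hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, features):
    """Single-head cross-attention (assumed stand-in for a transformer decoder).

    Each category query attends over all image feature tokens and returns
    an image-specific representation for that category.
    """
    dim = len(queries[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, f) / math.sqrt(dim) for f in features])
        outputs.append(
            [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]
        )
    return outputs


def image_level_logits(category_reprs, classifier_weights, biases):
    """One logit per category: does this interaction occur anywhere in the image?"""
    return [dot(r, w) + b for r, w, b in zip(category_reprs, classifier_weights, biases)]


def bce_with_logits(logits, targets):
    """Numerically stable mean binary cross-entropy over categories."""
    total = 0.0
    for z, t in zip(logits, targets):
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


random.seed(0)
num_categories, dim, num_tokens = 4, 8, 6  # toy sizes, chosen for illustration

# Learnable in a real model; random here.
category_queries = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_categories)]
classifier_weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_categories)]
biases = [0.0] * num_categories

# Stand-in for backbone/encoder output of one image.
image_features = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_tokens)]

# Multi-hot image-level labels: which interaction categories appear in the image.
targets = [1.0, 0.0, 0.0, 1.0]

category_reprs = cross_attend(category_queries, image_features)
logits = image_level_logits(category_reprs, classifier_weights, biases)
aux_loss = bce_with_logits(logits, targets)
print(f"auxiliary image-level loss: {aux_loss:.4f}")
```

In a full model, the resulting category representations would also feed the per-pair interaction classifier of the underlying HOI baseline; that part is omitted here because the abstract does not specify it.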
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Human-Object Interaction Detection | HICO-DET | mAP (Full) | 38.82 | 233 |
| Human-Object Interaction Detection | HICO-DET Known Object (test) | mAP (Full) | 38.82 | 112 |
| Human-Object Interaction Detection | V-COCO 1.0 (test) | AP_role (#1) | 66.5 | 76 |
| Human-Object Interaction Detection | V-COCO | AP^1 Role | 66.5 | 65 |
| HOI Detection | V-COCO | AP Role 1 | 66.4 | 40 |
| HOI Detection | HICO-DET v1.0 (test) | mAP (Default, Full) | 36.03 | 29 |
| HOI Detection | HICO-DET | mAP (Default Full) | 35.36 | 21 |
| Human-Object Interaction Detection | HICO-DET (train) | Inference Time (Hour) | 29.7 | 8 |