Hand-Object Interaction Image Generation

About

In this work, we are dedicated to a new task, i.e., hand-object interaction image generation, which aims to conditionally generate the hand-object image under the given hand, object and their interaction status. This task is challenging and research-worthy in many potential application scenarios, such as AR/VR games and online shopping, etc. To address this problem, we propose a novel HOGAN framework, which utilizes the expressive model-aware hand-object representation and leverages its inherent topology to build the unified surface space. In this space, we explicitly consider the complex self- and mutual occlusion during interaction. During final image synthesis, we consider different characteristics of hand and object and generate the target image in a split-and-combine manner. For evaluation, we build a comprehensive protocol to access both the fidelity and structure preservation of the generated image. Extensive experiments on two large-scale datasets, i.e., HO3Dv3 and DexYCB, demonstrate the effectiveness and superiority of our framework both quantitatively and qualitatively. The project page is available at https://play-with-hoi-generation.github.io/.

Hezhen Hu, Weilun Wang, Wengang Zhou, Houqiang Li• 2022

Related benchmarks

Task	Dataset	Result
3D Hand Reconstruction	HO3D v3	PA-MPJPE11.8	25
Hand-object interaction image generation	DexYCB	FID30.1	7
Hand-object manipulation video generation	DexYCB	FID64.74	4
Object Structure Preservation	HO3D v3	Sugar Box Preservation Score52.1	4
Hand-object interaction image generation	HO3D v3	FID41.3	3

Showing 5 of 5 rows

Other info

Code

Follow for update

@wizwand_team Discord