Perceiving 3D Human-Object Spatial Arrangements from a Single Image in the Wild

About

We present a method that infers spatial arrangements and shapes of humans and objects in a globally consistent 3D scene, all from a single image in-the-wild captured in an uncontrolled environment. Notably, our method runs on datasets without any scene- or object-level 3D supervision. Our key insight is that considering humans and objects jointly gives rise to "3D common sense" constraints that can be used to resolve ambiguity. In particular, we introduce a scale loss that learns the distribution of object size from data; an occlusion-aware silhouette re-projection loss to optimize object pose; and a human-object interaction loss to capture the spatial layout of objects with which humans interact. We empirically validate that our constraints dramatically reduce the space of likely 3D spatial configurations. We demonstrate our approach on challenging, in-the-wild images of humans interacting with large objects (such as bicycles, motorcycles, and surfboards) and handheld objects (such as laptops, tennis rackets, and skateboards). We quantify the ability of our approach to recover human-object arrangements and outline remaining challenges in this relatively unexplored domain. The project webpage can be found at https://jasonyzhang.com/phosa.

Jason Y. Zhang, Sam Pepose, Hanbyul Joo, Deva Ramanan, Jitendra Malik, Angjoo Kanazawa • 2020

Related benchmarks

Task	Dataset	Result
3D human and object reconstruction	BEHAVE	CD Human: 12.17	11
3D human and object reconstruction	InterCap	CD (Human): 11.2	11
3D human reconstruction	BEHAVE	SMPL v2v Error (cm): 6.9	8
Joint Human and Object Reconstruction	BEHAVE (test)	CD (SMPL) (cm): 5.758	8
3D Human-Object Interaction reconstruction	InterCap (test)	PA-CDh (cm): 10.07	7
Joint Human-Object Tracking	BEHAVE extended (key frames)	SMPL Chamfer Distance: 12.86	6
Joint Human-Object Tracking	InterCap official (key frames)	SMPL Chamfer Distance: 11.2	6
Joint 3D human and object reconstruction	Open3DHOI (test)	CD Human: 5.342	6
3D Human-Object Interaction reconstruction	BEHAVE 4 (in-lab)	Chamfer Distance (Human): 12.17	6
3D human-object reconstruction	Open3DHOI non-contact scenarios	CDhuman: 5.401	5

Showing 10 of 16 rows
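
Most results above are Chamfer distances (CD) between a reconstructed human surface and a reference. As a rough illustration only, the sketch below computes one common symmetric variant on small point sets. The function name `chamfer_distance`, the averaging convention, and the toy points are assumptions for illustration; each benchmark may use a different variant (for example squared distances, sums instead of means, or a prior alignment step), so this is not the evaluation code behind any row.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    Assumed convention: mean nearest-neighbour Euclidean distance in each
    direction, averaged over the two directions.
    """
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def mean_nearest(src, dst):
        # Average, over src, of the distance to the closest point in dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (mean_nearest(points_a, points_b) + mean_nearest(points_b, points_a))


# Toy example (hypothetical points): the reference is shifted by 0.1 along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
cd = chamfer_distance(predicted, reference)
```

The result is in the same unit as the input coordinates, so points given in centimetres yield a distance in centimetres. The brute-force search is quadratic in the number of points; dense meshes would normally call for a spatial index.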
