Physical Property Understanding from Language-Embedded Feature Fields

About

Can computers perceive the physical properties of objects solely through vision? Research in cognitive science and vision science has shown that humans excel at identifying materials and estimating their physical properties based purely on visual appearance. In this paper, we present a novel approach for dense prediction of the physical properties of objects using a collection of images. Inspired by how humans reason about physics through vision, we leverage large language models to propose candidate materials for each object. We then construct a language-embedded point cloud and estimate the physical properties of each 3D point using a zero-shot kernel regression approach. Our method is accurate, annotation-free, and applicable to any object in the open world. Experiments demonstrate the effectiveness of the proposed approach in various physical property reasoning tasks, such as estimating the mass of common objects, as well as other properties like friction and hardness.
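The zero-shot kernel regression step mentioned above can be illustrated with a minimal sketch. It assumes each 3D point and each candidate material already has an embedding in a shared vision-language feature space, and that each candidate material comes with a proposed property value. The function name `kernel_regress`, the softmax kernel, the `temperature` parameter, and the density values are illustrative assumptions, not details confirmed by the abstract.

```python
import numpy as np

def kernel_regress(point_feats, material_feats, material_values, temperature=0.1):
    """Hypothetical sketch: per-point property estimate as a similarity-weighted
    average of candidate-material property values.

    point_feats:     (N, D) embeddings of the 3D points
    material_feats:  (K, D) embeddings of the candidate material names
    material_values: (K,)   property value proposed for each material
    """
    # Unit-normalize so the dot product is cosine similarity.
    p = point_feats / np.linalg.norm(point_feats, axis=1, keepdims=True)
    m = material_feats / np.linalg.norm(material_feats, axis=1, keepdims=True)
    logits = (p @ m.T) / temperature             # (N, K) scaled similarities
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the exponentials
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax kernel, rows sum to 1
    return weights @ material_values             # (N,) per-point estimates

# Stand-in data: random embeddings and illustrative densities in kg/m^3.
rng = np.random.default_rng(0)
material_feats = rng.normal(size=(3, 16))
material_values = np.array([500.0, 2700.0, 7800.0])
point_feats = rng.normal(size=(5, 16))
per_point_density = kernel_regress(point_feats, material_feats, material_values)
```

Because the weights form a convex combination, every per-point estimate lies between the smallest and largest candidate value, and a lower temperature pushes each point toward its single most similar material.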

Albert J. Zhai, Yuan Shen, Emily Y. Chen, Gloria X. Wang, Xinlei Wang, Sheng Wang, Kaiyu Guan, Shenlong Wang • 2024

Related benchmarks

Task	Dataset	Result
Mass estimation	ABO-500 (test)	ADE 8.73	15
Inference Time	GVM (test)	Inference Time (s) 1.45e+3	11
Voxel Mechanical Property Estimation	Voxelized 3D Objects (test)	Young's Modulus ALDE 2.5719	8
Mechanical Property Estimation	Released dataset public (test)	Young's Modulus ALDE 2.8	4
Per-point kinetic friction coefficient estimation	in-house collected dataset (6 points, 6 objects) (test)	ADE 0.155	3
Material Property Prediction	PixieVerse	Preprocessing Time 65	3
Per-point Shore Hardness Estimation	real-world in-house collected dataset Shore hardness	Average Deviation Error 34.295	3

