EZBlender: Efficient 3D Editing with Plan-and-ReAct Agent

About

As a cornerstone of the modern digital economy, 3D modeling and rendering demand substantial resources and manual effort when scene editing is performed in the traditional manner. Despite recent progress in VLM-based agents for 3D editing, the fundamental trade-off between editing precision and agent responsiveness remains unresolved. To address this trade-off, we present EZBlender, a Blender agent with a hybrid framework that combines planning-based task decomposition and reactive local autonomy for efficient human-AI collaboration and semantically faithful 3D editing. Specifically, this previously unexplored Plan-and-ReAct design not only preserves editing quality but also significantly reduces latency and computational cost. To further validate the efficiency and effectiveness of the proposed edge-autonomy architecture, we construct a dedicated multi-tasking benchmark, a setting that has not been systematically investigated in prior research. In addition, we provide a comprehensive analysis of language model preference, system responsiveness, and economic efficiency.
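
The combination of planning-based decomposition and reactive local autonomy described above can be illustrated with a toy sketch. This is not EZBlender's implementation: the `Subtask`, `plan`, `react`, and `run` names, the dictionary standing in for a Blender scene, and the fixed lookup table standing in for a language-model planner are all assumptions made for illustration. The point it shows is only the control flow: one up-front planning call, followed by a local act-observe loop per subtask that does not call the planner again.

```python
from dataclasses import dataclass
from typing import Dict, List

# Toy stand-in for a Blender scene: object name -> numeric properties (assumption).
Scene = Dict[str, Dict[str, float]]


@dataclass
class Subtask:
    target: str  # object to edit
    prop: str    # property to change
    goal: float  # desired value


def plan(instruction: str) -> List[Subtask]:
    """Hypothetical planner: a single up-front decomposition of the request.

    A real agent would query a language model here; this stub uses a fixed table.
    """
    table = {
        "make the cube bigger and move the lamp up": [
            Subtask("cube", "scale", 2.0),
            Subtask("lamp", "z", 5.0),
        ]
    }
    return table.get(instruction, [])


def react(scene: Scene, task: Subtask, max_steps: int = 10, step: float = 1.0) -> bool:
    """Local act-observe loop for one subtask, with no replanning."""
    for _ in range(max_steps):
        current = scene[task.target][task.prop]  # observe
        error = task.goal - current
        if abs(error) < 1e-6:
            return True  # subtask satisfied
        # act: move toward the goal by at most `step`
        scene[task.target][task.prop] = current + max(-step, min(step, error))
    return abs(task.goal - scene[task.target][task.prop]) < 1e-6


def run(scene: Scene, instruction: str) -> List[bool]:
    """Plan once, then resolve each subtask with its own reactive loop."""
    return [react(scene, task) for task in plan(instruction)]


if __name__ == "__main__":
    scene = {"cube": {"scale": 1.0}, "lamp": {"z": 1.0}}
    print(run(scene, "make the cube bigger and move the lamp up"), scene)
```

In this sketch the planner is consulted once per instruction, while corrections happen inside `react`; that split is the part of the design the abstract associates with lower latency and cost, though the abstract itself does not specify how the two stages interact.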

Hao Wang, Wenhui Zhu, Shao Tang, Zhipeng Wang, Xuanzhao Dong, Xin Li, Xiwen Chen, Ashish Bastola, Xinhao Huang, Yalin Wang, Abolfazl Razi • 2026

Related benchmarks

Task	Dataset	Metric	Result
3D Scene Editing	3D Scene Editing	Prompt Tokens	4.62e+3	5
3D Scene Editing	3D Editing Benchmark Scenario S1	TCR (%)	78.67	3
3D Scene Editing	3D Editing Benchmark Scenario S2	TCR (%)	84.67	3
3D Scene Editing	3D Editing Benchmark Scenario S3	TCR (%)	60.66	3
3D Scene Editing	3D Editing Benchmark Scenario S4	TCR (%)	61.33	3
3D Scene Editing	3D Editing Benchmark Scenario S5	TCR (%)	58.66	3
Text-Prompt Editing	BlenderGym (test)	Shapekey CLIP Score	30.21	3
Visual-Prompt Editing	BlenderGym VLM (test)	Blend Shape CLIP Sim	0.9816	3
3D Scene Editing	15 distinct single-task prompts	LLM Time	20.58	3

