Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions

About

We propose a method for editing NeRF scenes with text instructions. Given a NeRF of a scene and the collection of images used to reconstruct it, our method uses an image-conditioned diffusion model (InstructPix2Pix) to iteratively edit the input images while optimizing the underlying scene, resulting in an optimized 3D scene that respects the edit instruction. We demonstrate that our proposed method can edit large-scale, real-world scenes and can accomplish more realistic, targeted edits than prior work.

Ayaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa • 2023
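
The loop described in the abstract (edit one training image, keep optimizing the scene, repeat) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `iterative_dataset_update`, `ToyScene`, and `toy_edit` are hypothetical names, the "scene" is a single brightness value rather than a NeRF, and `toy_edit` is a stand-in for InstructPix2Pix that simply brightens pixels.

```python
import random
from typing import Callable, List

Image = List[float]  # toy stand-in: an "image" is a flat list of pixel values


def iterative_dataset_update(
    originals: List[Image],
    render: Callable[[int], Image],
    edit: Callable[[Image, Image, str], Image],
    train_step: Callable[[List[Image]], None],
    instruction: str,
    num_iters: int,
    seed: int = 0,
) -> List[Image]:
    """Alternate between editing one training image and one scene update."""
    rng = random.Random(seed)
    dataset = [list(img) for img in originals]  # working copy; originals stay intact
    for _ in range(num_iters):
        view = rng.randrange(len(dataset))
        rendered = render(view)  # current scene as seen from this view
        # The editor sees the render, the original capture, and the instruction.
        dataset[view] = edit(rendered, originals[view], instruction)
        train_step(dataset)  # one optimization step against the updated images
    return dataset


class ToyScene:
    """Hypothetical 'scene': one global brightness value instead of a NeRF."""

    def __init__(self, value: float) -> None:
        self.value = value

    def render(self, view: int) -> Image:
        return [self.value] * 4

    def train_step(self, dataset: List[Image]) -> None:
        # Move the scene halfway toward the mean pixel value of the dataset.
        total = sum(sum(img) for img in dataset)
        count = sum(len(img) for img in dataset)
        self.value += 0.5 * (total / count - self.value)


def toy_edit(rendered: Image, original: Image, instruction: str) -> Image:
    # Hypothetical editor: nudges every pixel halfway toward full brightness.
    return [p + 0.5 * (1.0 - p) for p in rendered]


scene = ToyScene(0.0)
originals = [[0.0] * 4 for _ in range(3)]
edited = iterative_dataset_update(
    originals, scene.render, toy_edit, scene.train_step, "make it brighter", 200
)
```

Because each edit starts from a render of the current scene rather than from the raw capture, the per-image edits and the scene pull each other toward a consistent result over many iterations, which is the intuition the toy is meant to convey.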

Related benchmarks

Task	Dataset	Result
NeRF Colorization	LLFF	CF 45.599	8
3D Scene Editing	3D Scene Editing Evaluation Set (test)	CLIP Similarity 24.8	7
Portrait Editing	Tensor4D static scenes	CLIP Similarity 0.2989	7
Super-Resolution	LLFF	PSNR 20.299	6
Local 3D Editing	Evaluation dataset unseen 3D assets (test)	CLIP Similarity 0.253	6
Global 3D Editing	Evaluation dataset unseen 3D assets (test)	CLIP Similarity 0.239	6
Text-driven NeRF Editing	Face, Fangzhou, and Farm (test)	CLIP Dir Sim 0.2021	5
Novel-view stylization	53 stylizations (Instruct-NeRF2NeRF, GaussCtrl, ScanNet++, Mip-NeRF360, and new scenes) (full evaluation set)	CLIP Direction Similarity 0.098	5
Object Insertion	35 unique edits (5 scenes x 7 objects) (test)	CLIPScore 0.2347	5
Stylization Semantic Alignment	Rodin 35 examples	CLIP-IQA 23.93	5

Showing 10 of 18 rows
