
When Digital Twins Meet Large Language Models: Realistic, Interactive, and Editable Simulation for Autonomous Driving

About

Simulation frameworks have been key enablers for the development and validation of autonomous driving systems. However, existing methods struggle to comprehensively address the autonomy-oriented requirements of balancing: (i) dynamical fidelity, (ii) photorealistic rendering, (iii) context-relevant scenario orchestration, and (iv) real-time performance. To address these limitations, we present a unified framework for creating and curating high-fidelity digital twins to accelerate advancements in autonomous driving research. Our framework leverages a mix of physics-based and data-driven techniques for developing and simulating digital twins of autonomous vehicles and their operating environments. It is capable of reconstructing real-world scenes and assets with geometric and photorealistic accuracy (~97% structural similarity) and infusing them with physical properties to enable real-time (>60 Hz) dynamical simulation of the ensuing driving scenarios. Additionally, it incorporates a large language model (LLM) interface to flexibly edit the driving scenarios online via natural language prompts, with ~85% generalizability and ~95% repeatability. Finally, an optional vision language model (VLM) provides ~80% visual enhancement by blending the hybrid scene composition.
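The paper does not publish its LLM interface, but the idea of editing a running driving scenario from natural-language prompts can be sketched as follows. This is a toy rule-based stand-in, not the authors' implementation: the `Scenario` fields, command names, and keyword matching are illustrative assumptions, and a real system would query an actual LLM for a structured edit.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Scenario:
    """Minimal stand-in for a simulated driving scenario state."""
    weather: str = "clear"
    traffic_density: float = 0.3   # fraction of occupied spawn points (assumed)
    events: list = field(default_factory=list)

def apply_prompt(scenario: Scenario, prompt: str) -> Scenario:
    """Apply a natural-language edit to the scenario online.

    A real framework would ask an LLM to translate the prompt into a
    structured edit command; here we pattern-match a few examples instead.
    """
    p = prompt.lower()
    if "rain" in p:
        scenario.weather = "rain"
    if "fog" in p:
        scenario.weather = "fog"
    m = re.search(r"traffic (?:density )?to ([0-9.]+)", p)
    if m:
        scenario.traffic_density = min(1.0, float(m.group(1)))
    if "cut-in" in p or "cut in" in p:
        scenario.events.append("lead_vehicle_cut_in")
    return scenario

s = apply_prompt(Scenario(), "Make it rain and set traffic density to 0.8")
print(s.weather, s.traffic_density)  # rain 0.8
```

Keeping the edit interface structured like this (fields mutated by explicit commands) is what makes repeatability measurable: the same prompt either produces the same scenario state or it does not.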

Tanmay Vilas Samak, Chinmay Vilas Samak, Bing Li, Venkat Krovi • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Scene Reconstruction | CU-ICAR | CCD: 3.76e-4 | 4 |
| Scenario Reconfiguration | Scenario Reconfiguration Evaluation Set (100 trials, 7 tasks, 4 prompt grades) | -- | 3 |
