Grounding LLMs in Scientific Discovery via Embodied Actions

About

Large Language Models (LLMs) have shown significant potential in scientific discovery but struggle to bridge the gap between theoretical reasoning and verifiable physical simulation. Existing solutions operate in a passive "execute-then-response" loop and thus lack runtime perception, leaving agents blind to transient anomalies (e.g., numerical instability or diverging oscillations). To address this limitation, we propose EmbodiedAct, a framework that transforms established scientific software into active embodied agents by grounding LLMs in embodied actions with a tight perception-execution loop. We instantiate EmbodiedAct within MATLAB and evaluate it on complex engineering design and scientific modeling tasks. Extensive experiments show that EmbodiedAct significantly outperforms existing baselines, achieving state-of-the-art performance through improved reliability and stability in long-horizon simulations and higher accuracy in scientific modeling.
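To illustrate the difference between checking a result only after execution and perceiving the state while the simulation runs, here is a minimal sketch in Python. It is not the authors' implementation and does not use MATLAB; the function names (`euler_steps`, `run_with_monitor`), the test equation dx/dt = -k * x, and the divergence threshold are all assumptions chosen for illustration.

```python
import math


def euler_steps(k, dt, x0, n_steps):
    """Yield (step, x) for explicit Euler on dx/dt = -k * x (toy model)."""
    x = x0
    for step in range(1, n_steps + 1):
        x = x + dt * (-k * x)
        yield step, x


def run_with_monitor(k, dt, x0=1.0, n_steps=1000, limit=1e6):
    """Inspect the state after every step and stop early on divergence.

    `limit` is an arbitrary, hypothetical threshold for "the state blew up".
    """
    last = x0
    for step, x in euler_steps(k, dt, x0, n_steps):
        last = x
        if not math.isfinite(x) or abs(x) > limit:
            # Anomaly perceived at runtime: report where it happened.
            return {"status": "diverged", "step": step, "x": x}
    return {"status": "ok", "step": n_steps, "x": last}


# Explicit Euler is unstable here when k * dt > 2, so the state oscillates
# with growing amplitude and the monitor stops the run early.
unstable = run_with_monitor(k=10.0, dt=0.3)
# A smaller step size keeps the same model stable.
stable = run_with_monitor(k=10.0, dt=0.01)
print(unstable["status"], unstable["step"])
print(stable["status"], stable["step"])
```

In this toy setting, the monitored run reports the step at which the oscillation starts to diverge, so a caller could react (for example, by reducing `dt`) instead of waiting for a long run to finish and then inspecting an unusable result.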

Bo Zhang, Jinfeng Zhou, Yuxuan Chen, Jianing Yin, Minlie Huang, Hongning Wang • 2026

Related benchmarks

Task	Dataset	Result
Engineering Design	EngDesign Core	Overall Average Score 70.6	24
Engineering Design	EngDesign Extended	Overall Average Score 65.4	24
Scientific problem solving	SciBench-107	Atkins Score 62.5	24
Multi-step Reasoning	multi-step reasoning tasks	Average Score 61.6	3
