Solving Physics Olympiad via Reinforcement Learning on Physics Simulators

About

We have witnessed remarkable advances in LLM reasoning capabilities with the advent of DeepSeek-R1. However, much of this progress has been fueled by the abundance of internet question-answer (QA) pairs, and this dependence is a major bottleneck going forward, since such data is limited in scale and concentrated mainly in domains like mathematics. In contrast, other sciences such as physics lack large-scale QA datasets to effectively train reasoning-capable models. In this work, we show that physics simulators can serve as a powerful alternative source of supervision for training LLMs for physical reasoning. We generate random scenes in physics engines, create synthetic question-answer pairs from simulated interactions, and train LLMs using reinforcement learning on this synthetic data. Our models exhibit zero-shot sim-to-real transfer to real-world physics benchmarks: for example, training solely on synthetic simulated data improves performance on IPhO (International Physics Olympiad) problems by 5-10 percentage points across model sizes. These results demonstrate that physics simulators can act as scalable data generators, enabling LLMs to acquire deep physical reasoning skills beyond the limitations of internet-scale QA data. Code available at: https://sim2reason.github.io/.
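The pipeline described in the abstract (sample a random scene, simulate it, turn the outcome into a question-answer pair, and score model answers for reinforcement learning) can be illustrated with a toy sketch. This is not the authors' code: the hand-written projectile integrator stands in for a real physics engine, and every name here (`Scene`, `sample_scene`, `simulate_range`, `make_qa_pair`, `reward`) as well as the binary, tolerance-based reward is an assumption made for illustration.

```python
import math
import random
import re
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


@dataclass
class Scene:
    """Hypothetical scene: a projectile launched from flat ground."""
    speed: float      # launch speed, m/s
    angle_deg: float  # launch angle above the horizontal, degrees


def sample_scene(rng: random.Random) -> Scene:
    """Draw random scene parameters (ranges chosen arbitrarily)."""
    return Scene(speed=round(rng.uniform(5.0, 30.0), 1),
                 angle_deg=round(rng.uniform(20.0, 70.0), 1))


def simulate_range(scene: Scene, dt: float = 1e-3) -> float:
    """Step the projectile forward until it lands; return horizontal range."""
    theta = math.radians(scene.angle_deg)
    vx = scene.speed * math.cos(theta)
    vy = scene.speed * math.sin(theta)
    x, y = 0.0, 0.0
    while True:
        x_next = x + vx * dt
        y_next = y + vy * dt - 0.5 * G * dt * dt
        if y_next < 0.0:
            # Linearly interpolate the ground crossing inside the last step.
            frac = y / (y - y_next)
            return x + frac * (x_next - x)
        x, y = x_next, y_next
        vy -= G * dt


def make_qa_pair(scene: Scene) -> dict:
    """Turn a simulated scene into a synthetic question-answer pair."""
    question = (
        f"A projectile is launched from level ground at {scene.speed} m/s, "
        f"{scene.angle_deg} degrees above the horizontal. Ignoring air "
        f"resistance and taking g = {G} m/s^2, how far away does it land "
        f"(in metres)?"
    )
    return {"question": question, "answer": simulate_range(scene)}


def reward(response: str, answer: float, rel_tol: float = 0.01) -> float:
    """Binary reward: 1.0 if the last number in the response is within
    rel_tol of the simulated answer, else 0.0 (an assumed reward design)."""
    numbers = re.findall(r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?", response)
    if not numbers:
        return 0.0
    predicted = float(numbers[-1])
    return 1.0 if abs(predicted - answer) <= rel_tol * abs(answer) else 0.0


if __name__ == "__main__":
    qa = make_qa_pair(sample_scene(random.Random(0)))
    print(qa["question"])
    print("simulated answer:", qa["answer"])
```

In a setup like this, the simulator rather than a human annotator supplies the ground truth, so new training examples cost only compute; a policy-gradient method would then use `reward` as the scalar training signal.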

Mihir Prabhudesai, Aryan Satpathy, Yangmin Li, Zheyang Qin, Nikash Bhardwaj, Amir Zadeh, Chuan Li, Katerina Fragkiadaki, Deepak Pathak • 2026

Related benchmarks

Task	Dataset	Result
Mathematical Reasoning	MATH 500	--	116
Physics Reasoning	Synthetic Numeric	Accuracy: 21.9	10
Physics Reasoning	Synthetic Symbolic	Accuracy: 10.4	10
Physics Reasoning	HCV	Accuracy: 59	10
Physics Reasoning	IPhO Mechanics	Accuracy: 40	10
Question Answering	IPhO	IPhO Accuracy: 40	4
Mathematical & Scientific Reasoning	OlympiadBench	Mean Accuracy: 44.53	2
Physics Reasoning	Physics	Mean Accuracy: 43.09	2
Scientific Reasoning	JEEBench	Mean Accuracy: 52.28	2
