R&D-Agent: An LLM-Agent Framework Towards Autonomous Data Science

About

Recent advances in AI and ML have transformed data science, yet growing complexity and expertise requirements continue to hinder progress. Although crowd-sourcing platforms alleviate some of these challenges, high-level machine learning engineering (MLE) tasks remain labor-intensive and iterative. We introduce R&D-Agent, a comprehensive, decoupled, and extensible framework that formalizes the MLE process. R&D-Agent organizes the MLE workflow into two phases and six components, turning agent design for MLE from ad-hoc craftsmanship into a principled, testable process. Although several existing agents report promising gains on their chosen components, most can be described as partial optimizations of our framework's simple baseline. Inspired by human experts, we design efficient and effective agents within this framework that achieve state-of-the-art performance. Evaluated on MLE-Bench, the agent built on R&D-Agent ranks as the top-performing machine learning engineering agent, with an any-medal rate of 35.1%, demonstrating the framework's ability to speed up innovation and improve accuracy across a wide range of data science applications. We have open-sourced R&D-Agent on GitHub: https://github.com/microsoft/RD-Agent.

Xu Yang, Xiao Yang, Shikai Fang, Yifei Zhang, Jian Wang, Bowen Xian, Qizheng Li, Jingyuan Li, Minrui Xu, Yuante Li, Haoran Pan, Yuge Zhang, Weiqing Liu, Yelong Shen, Weizhu Chen, Jiang Bian • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Autonomous Machine Learning Engineering	MLE-Bench Lite	Any Medal Rate	68.18	57
Automated Machine Learning	MLE-Bench	Valid Submission Rate	53.33	14
Machine learning engineering	MLE-Bench Lite	Any Medal (%)	48.18	13
Automated AI Research	MLE-Bench official (full)	Valid Submission Rate	53.3	13
AutoML	KompeteAI-Bench Contemporary part	Score	6.9	8
Machine learning engineering	MLE-bench Low	Medal Rate	68.18	5
Machine learning engineering	MLE-bench (All)	Medal Rate	35.11	5
Machine learning engineering	MLE-bench Medium	Medal Rate	21.05	5
Machine learning engineering	MLE-bench Hard	Medal Rate	22.22	5
Virtual Cell Modeling	20 virtual cell modeling trials	Preprocess Error	45	3
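
The Low, Medium, and Hard medal rates listed above can be related to the MLE-bench (All) figure by a size-weighted average. The sketch below is only a consistency check, not the authors' evaluation code. The per-split competition counts (22, 38, 15) are an assumption taken from the commonly cited MLE-bench complexity split and do not appear on this page; the function name is hypothetical.

```python
# Per-split medal rates (%) as listed in the table above.
split_rates = {"Low": 68.18, "Medium": 21.05, "Hard": 22.22}

# ASSUMPTION: number of competitions in each split (not stated on this page).
split_sizes = {"Low": 22, "Medium": 38, "Hard": 15}


def weighted_medal_rate(rates, sizes):
    """Average the per-split rates, weighting each split by its size."""
    total = sum(sizes[name] for name in rates)
    return sum(rates[name] * sizes[name] for name in rates) / total


overall = weighted_medal_rate(split_rates, split_sizes)
print(f"{overall:.2f}")  # prints 35.11
```

Under the assumed split sizes, the weighted average agrees with the 35.11 listed for MLE-bench (All) and with the 35.1% quoted in the abstract.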
