ASI-Evolve: AI Accelerates AI

About

Can AI accelerate the development of AI itself? While recent agentic systems have shown strong performance on well-scoped tasks with rapid feedback, it remains unclear whether they can tackle the costly, long-horizon, and weakly supervised research loops that drive real AI progress. We present ASI-Evolve, an agentic framework for AI-for-AI research that closes this loop through a learn-design-experiment-analyze cycle. ASI-Evolve augments standard evolutionary agents with two key components: a cognition base that injects accumulated human priors into each round of exploration, and a dedicated analyzer that distills complex experimental outcomes into reusable insights for future iterations. To our knowledge, ASI-Evolve is the first unified framework to demonstrate AI-driven discovery across three central components of AI development: data, architectures, and learning algorithms. In neural architecture design, it discovered 105 SOTA linear attention architectures, with the best discovered model surpassing DeltaNet by +0.97 points, nearly 3x the gain of recent human-designed improvements. In pretraining data curation, the evolved pipeline improves average benchmark performance by +3.96 points, with gains exceeding 18 points on MMLU. In reinforcement learning algorithm design, discovered algorithms outperform GRPO by up to +12.5 points on AMC32, +11.67 points on AIME24, and +5.04 points on OlympiadBench. We further provide initial evidence that this AI-for-AI paradigm can transfer beyond the AI stack through experiments in mathematics and biomedicine. Together, these results suggest that ASI-Evolve represents a promising step toward enabling AI to accelerate AI across the foundational stages of development, offering early evidence for the feasibility of closed-loop AI research.

Weixian Xu, Tiantian Mi, Yixiu Liu, Yang Nan, Zhimeng Zhou, Lyumanshan Ye, Lin Zhang, Yu Qiao, Pengfei Liu • 2026
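
The learn-design-experiment-analyze cycle described in the abstract can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the ASI-Evolve implementation. The names `CognitionBase`, `Candidate`, `design`, `experiment`, `analyze`, and `evolve` are hypothetical. The "experiment" is a cheap quadratic objective standing in for a costly training run. The "cognition base" holds a single numeric prior (a step size) plus text notes written by the analyzer.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Candidate:
    params: list
    score: float = float("-inf")


@dataclass
class CognitionBase:
    # Hypothetical store: one numeric prior plus text insights from the analyzer.
    step_size: float = 1.0
    insights: list = field(default_factory=list)


def experiment(candidate):
    # Toy stand-in for a costly run: the score peaks at params == [3, 3].
    return -sum((p - 3.0) ** 2 for p in candidate.params)


def design(parent, base, rng):
    # Learn + design: read the prior from the base and propose a variant.
    return Candidate([p + rng.gauss(0.0, base.step_size) for p in parent.params])


def analyze(parent, child, base):
    # Distill the outcome into a reusable note and adjust the prior.
    improved = child.score > parent.score
    if not improved:
        base.step_size *= 0.9  # assumption: failed proposals suggest smaller steps
    base.insights.append(
        f"step={base.step_size:.3f} improved={improved} score={child.score:.3f}"
    )
    return improved


def evolve(rounds=200, seed=0):
    rng = random.Random(seed)
    base = CognitionBase()
    best = Candidate([0.0, 0.0])
    best.score = experiment(best)
    for _ in range(rounds):
        child = design(best, base, rng)   # learn + design
        child.score = experiment(child)   # experiment
        if analyze(best, child, base):    # analyze; keep the child only if it improved
            best = child
    return best, base


best, base = evolve()
print(best.params, best.score, len(base.insights))
```

Because a child replaces the current best only when it scores higher, the best score never decreases across rounds. The analyzer's notes accumulate in the base, one per round, and the adjusted step size feeds back into the next design step.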

Related benchmarks

Task	Dataset	Result
Commonsense Reasoning	WinoGrande	--	1442
Question Answering	ARC Easy	--	597
Physical Interaction Question Answering	PIQA	Accuracy 76.8	415
Logical reasoning	BBH	--	249
Bias Evaluation	BBQ	Accuracy 31.46	171
Social Commonsense Reasoning	SocialIQA	Accuracy 43.58	143
Multitask Knowledge	MMLU	Accuracy 46.13	92
Multiple-choice Question Answering	MedMCQA	Accuracy 40.97	42
Reasoning	DROP	Score 19.48	42
Drug-Target Interaction Prediction	BIOSNAP	--	28

Showing 10 of 37 rows
