ChipMATE: Multi-Agent Training via Reinforcement Learning for Enhanced RTL Generation

About

Existing API-based agentic systems for RTL code generation are fundamentally misaligned with industrial practice: they assume a golden testbench is available at generation time, rely on closed-source APIs incompatible with chip vendors' air-gapped security requirements, and cannot be trained on vendors' proprietary RTL codebases, leaving valuable internal data unused. Recent self-trained models address the deployment constraint but remain single-turn generators that overlook the critical role of verification in real industrial flows. To bridge these gaps, we present ChipMATE, the first self-trained multi-agent framework for RTL generation. Inspired by industrial practice, where correctness emerges from cross-comparison between independently written RTL modules and reference models, ChipMATE pairs a Verilog agent with a Python reference-model agent; the two verify each other's outputs without any golden oracle. We design a backtrack-based inference workflow to prevent error propagation across turns, and a two-stage training pipeline that first trains each agent individually to saturate its code-generation capability, then trains the team jointly to collaborate effectively. To support training, we further build a hybrid data-generation framework that produces 64.4K high-quality reference-model training samples. ChipMATE achieves 75.0% and 80.1% pass@1 on VerilogEval V2 with 4B and 9B base models, respectively, outperforming all existing self-trained models and even DeepSeek V4 with 1600B parameters. Our code and model weights are publicly available at https://github.com/zhongkaiyu/ChipMATE.

Zhongkai Yu, Yichen Lin, Chenyang Zhou, Yuwei Zhang, Kun Zhou, Junxia Cui, Haotian Ye, Zhengding Hu, Zaifeng Pan, Ruiyi Wang, Yujie Zhao, Hejia Zhang, Jingbo Shang, Jishen Zhao, Yufei Ding • 2026
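
The cross-comparison idea in the abstract (an RTL implementation and an independently written reference model checking each other without a golden testbench) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not ChipMATE's actual code: the 8-bit adder spec, the function names, and the use of a plain Python callable as a stand-in for simulating generated Verilog are all hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple

Stimulus = Dict[str, int]
Model = Callable[[Stimulus], int]


def rtl_candidate(inputs: Stimulus) -> int:
    """Hypothetical stand-in for simulating a generated Verilog 8-bit adder."""
    return (inputs["a"] + inputs["b"]) & 0xFF


def buggy_rtl_candidate(inputs: Stimulus) -> int:
    """Hypothetical faulty variant that forgets to truncate the sum to 8 bits."""
    return inputs["a"] + inputs["b"]


def reference_candidate(inputs: Stimulus) -> int:
    """Hypothetical Python reference model written independently from the same spec."""
    return (inputs["a"] + inputs["b"]) % 256


def cross_check(
    dut: Model, ref: Model, num_vectors: int = 200, seed: int = 0
) -> List[Tuple[Stimulus, int, int]]:
    """Drive both models with the same random stimuli and collect disagreements.

    No golden oracle is involved: a mismatch only shows that at least one
    of the two candidates is wrong, not which one.
    """
    rng = random.Random(seed)
    mismatches = []
    for _ in range(num_vectors):
        stim = {"a": rng.randrange(256), "b": rng.randrange(256)}
        got, expected = dut(stim), ref(stim)
        if got != expected:
            mismatches.append((stim, got, expected))
    return mismatches


if __name__ == "__main__":
    print("agreeing pair mismatches:", len(cross_check(rtl_candidate, reference_candidate)))
    print("buggy pair mismatches:", len(cross_check(buggy_rtl_candidate, reference_candidate)))
```

Agreement between the two candidates does not prove correctness, since both could share the same misreading of the specification; it is a consistency signal. A disagreement is the more informative event, because it yields a concrete failing stimulus that either agent could use to revise its output.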

Related benchmarks

Task	Dataset	Result
Verilog Code Generation	RTLLM v2	Pass@1 75.8	27
Verilog Generation	VerilogEval v2	Pass@1 80.1	27
Verilog Generation	ChipBench-SC	Pass@1 36.7	14
Verilog Generation	CVDP cid03	Pass@1 Rate 40.4	14
Python reference-model generation	VerilogEval v2	Pass@1 82.4	10
Python reference-model generation	RTLLM v2	Pass@1 77.3	10
Python reference-model generation	ChipBench-SC	Pass@1 43.3	10
Python reference-model generation	CVDP cid03	Pass@1 43.3	10
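
For readers unfamiliar with the metric, the sketch below shows the unbiased pass@k estimator commonly used for code-generation benchmarks. This page does not state how many samples per problem were drawn or which estimator produced the numbers above, so treat it as background under that assumption rather than a description of the authors' evaluation script. The function name and arguments are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: total samples generated for the problem
    c: number of those samples that pass the tests
    k: sample budget being evaluated (k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples fail, every size-k subset contains a passing one.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # With k = 1 the estimate reduces to the fraction of passing samples, c / n.
    print(pass_at_k(n=10, c=4, k=1))
```

A benchmark-level score is then the mean of this per-problem value over all problems, usually reported as a percentage.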
