
KAT-Coder-V2 Technical Report

About

We present KAT-Coder-V2, an agentic coding model developed by the KwaiKAT team at Kuaishou. KAT-Coder-V2 adopts a "Specialize-then-Unify" paradigm that decomposes agentic coding into five expert domains (SWE, WebCoding, Terminal, WebSearch, and General), each undergoing independent supervised fine-tuning and reinforcement learning before being consolidated into a single model via on-policy distillation. We develop KwaiEnv, a modular infrastructure sustaining tens of thousands of concurrent sandbox instances, and scale RL training along task complexity, intent alignment, and scaffold generalization. We further propose MCLA for stabilizing MoE RL training, and Tree Training for eliminating redundant computation over tree-structured trajectories with up to a 6.2x speedup. KAT-Coder-V2 achieves 79.6% on SWE-bench Verified (vs. Claude Opus 4.6 at 80.8%), scores 88.7 on PinchBench (surpassing GLM-5 and MiniMax M2.7), ranks first across all three frontend aesthetics scenarios, and maintains strong generalist scores on Terminal-Bench Hard (46.8) and tau^2-Bench (93.9). Our model is publicly available at https://streamlake.com/product/kat-coder.
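The Tree Training idea mentioned above can be illustrated with a small sketch. This is not the KwaiKAT implementation (the report does not describe its internals here); it only shows, under simplified assumptions, why deduplicating shared prefixes saves compute: agent rollouts that branch from a common prompt form a tree, so a naive trainer re-encodes the shared prefix once per branch, while a prefix-tree pass processes each unique token position exactly once. All function names below are hypothetical.

```python
def naive_token_count(trajectories):
    """Tokens processed when every trajectory is encoded independently."""
    return sum(len(t) for t in trajectories)


def tree_token_count(trajectories):
    """Tokens processed when shared prefixes are computed once.

    Builds a trie over the trajectories; the number of trie nodes is
    the number of unique (prefix, token) positions to encode.
    """
    root = {}
    nodes = 0
    for traj in trajectories:
        node = root
        for tok in traj:
            if tok not in node:
                node[tok] = {}
                nodes += 1
            node = node[tok]
    return nodes


# Three rollouts branching from a common 4-token prompt prefix;
# the second and third also share the continuation token 20.
rollouts = [
    [1, 2, 3, 4, 10, 11],
    [1, 2, 3, 4, 20, 21, 22],
    [1, 2, 3, 4, 20, 30],
]

naive = naive_token_count(rollouts)  # 6 + 7 + 6 = 19
tree = tree_token_count(rollouts)    # 10 unique trie nodes
print(f"naive={naive} tree={tree} speedup={naive / tree:.2f}x")
```

The deeper and wider the rollout tree (long shared prompts, many branches), the larger the ratio; the 6.2x figure in the abstract reflects trees far bushier than this toy example.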

Fengxiang Li, Han Zhang, Haoyang Huang, Jinghui Wang, Jinhua Hao, Kun Yuan, Mengtong Li, Minglei Zhang, Pengcheng Xu, Wenhao Zhuang, Yizhen Shao, Zongxian Feng, Can Tang, Chao Wang, Chengxiao Tong, Fan Yang, Gang Xiong, Haixuan Gao, Han Gao, Hao Wang, Haochen Liu, Hongliang Sun, Jiabao Li, Jingwen Chang, Jun Du, Junyi Peng, Leizhen Cui, Meimei Jing, Mingqi Wu, Shangpeng Yan, Shaotong Qi, Suzhe Xu, Wenxuan Zhao, Xianda Sun, Xuan Xie, Yanbo Wang, Yao Xia, Yinghan Cui, Yingpeng Chen, Yong Wang, Yuze Shi, Zhiwei Shen, Ziyu Wang, Ming Sun, Lin Ye, Bin Chen • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Software Engineering | SWE-bench Verified | Resolution Rate | 79.6 | 26 |
| Code Agent | Terminal-Bench Hard | Score | 46.8 | 12 |
| Real-World Agent | PinchBench | Best Score | 88.7 | 6 |
| General Task (Agentic Coding) | tau2-Bench Telecom | Score | 93.9 | 6 |
| Real-World Agent | Claw-Eval | Pass@3 | 55.6 | 6 |
| General Task (Agentic Coding) | AA-LCR | Score | 68 | 6 |
| General Task (Agentic Coding) | IFBench | Score | 67 | 6 |
| Software Engineering Tasks | SWE-bench Multilingual (test) | Resolution Rate (%) | 75.4 | 4 |
| Software Engineering Tasks | SWE-rebench subset V2 (test) | Resolved Rate | 43.3 | 4 |
| Frontend Aesthetics Generation | Frontend Aesthetics Generation Landing Page | Aesthetic Score | 59.8 | 3 |
(Showing 10 of 12 benchmark results.)
