UI-Venus-1.5 Technical Report

About

GUI agents have emerged as a powerful paradigm for automating interactions in digital environments, yet achieving both broad generality and consistently strong task performance remains challenging. In this report, we present UI-Venus-1.5, a unified, end-to-end GUI Agent designed for robust real-world applications. The proposed model family comprises two dense variants (2B and 8B) and one mixture-of-experts variant (30B-A3B) to meet various downstream application scenarios. Compared to our previous version, UI-Venus-1.5 introduces three key technical advances: (1) a comprehensive Mid-Training stage leveraging 10 billion tokens across 30+ datasets to establish foundational GUI semantics; (2) Online Reinforcement Learning with full-trajectory rollouts, aligning training objectives with long-horizon, dynamic navigation in large-scale environments; and (3) a single unified GUI Agent constructed via Model Merging, which synthesizes domain-specific models (grounding, web, and mobile) into one cohesive checkpoint. Extensive evaluations demonstrate that UI-Venus-1.5 establishes new state-of-the-art performance on benchmarks such as ScreenSpot-Pro (69.6%), VenusBench-GD (75.0%), and AndroidWorld (77.6%), significantly outperforming previous strong baselines. In addition, UI-Venus-1.5 demonstrates robust navigation capabilities across a variety of Chinese mobile apps, effectively executing user instructions in real-world scenarios. Code: https://github.com/inclusionAI/UI-Venus; Model: https://huggingface.co/collections/inclusionAI/ui-venus
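The abstract does not say which merging algorithm produces the unified checkpoint. As a purely illustrative sketch, the simplest form of model merging is a weighted average of parameters across checkpoints that share an architecture. Everything below is an assumption made for illustration and is not taken from the report: the function name `merge_checkpoints`, the plain-Python stand-in for a state dict, and the choice of linear averaging.

```python
from typing import Dict, List

# Hypothetical stand-in for a model state dict: parameter name -> flat list of floats.
StateDict = Dict[str, List[float]]


def merge_checkpoints(checkpoints: List[StateDict], weights: List[float]) -> StateDict:
    """Linearly average parameters across checkpoints (illustrative only).

    This is plain weighted averaging, NOT necessarily the merging method
    used for UI-Venus-1.5; the report's abstract does not specify one.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    if len(checkpoints) != len(weights):
        raise ValueError("need one weight per checkpoint")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1

    names = checkpoints[0].keys()
    for ckpt in checkpoints[1:]:
        if ckpt.keys() != names:
            raise ValueError("checkpoints must share parameter names")

    merged: StateDict = {}
    for name in names:
        tensors = [ckpt[name] for ckpt in checkpoints]
        if len({len(t) for t in tensors}) != 1:
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Average element-wise across the checkpoints.
        merged[name] = [
            sum(w * v for w, v in zip(norm, column)) for column in zip(*tensors)
        ]
    return merged


# Toy usage with three hypothetical domain-specific checkpoints.
grounding = {"layer.weight": [1.0, 2.0]}
web = {"layer.weight": [3.0, 4.0]}
mobile = {"layer.weight": [5.0, 6.0]}
unified = merge_checkpoints([grounding, web, mobile], [1.0, 1.0, 1.0])
```

With equal weights, each merged parameter is the mean of the three domain values. Unequal weights would bias the unified checkpoint toward one domain.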

Venus Team, Changlong Gao, Zhangxuan Gu, Yulin Liu, Xinyu Qiu, Shuheng Shen, Yue Wen, Tianyu Xia, Zhenyu Xu, Zhengwen Zeng, Beitong Zhou, Xingran Zhou, Weizhi Chen, Sunhao Dai, Jingya Dou, Yichen Gong, Yuan Guo, Zhenlin Guo, Feng Li, Qian Li, Jinzhen Lin, Yuqi Zhou, Linchao Zhu, Liang Chen, Zhenyu Guo, Changhua Meng, Weiqiang Wang • 2026

Related benchmarks

Task	Dataset	Result
GUI Grounding	ScreenSpot Pro	Average Score 57.7	458
GUI Grounding	ScreenSpot v2	Avg Accuracy 95.9	371
GUI Grounding	ScreenSpot Pro	Accuracy 69.6	195
GUI Agent Task	AndroidWorld	Success Rate 65.9	188
GUI Grounding	OSWorld-G	Average Score 70.6	144
Mobile Task Automation	AndroidWorld (test)	Average Success Rate 0.776	119
Grounding	ScreenSpot v2	--	47
Web navigation	Mind2Web	Overall Success Rate 5.88	41
Mobile GUI Automation	AndroidLab	Success Rate 55.1	25
Action Prediction	AndroidControl High v2	Pass@1 Step Accuracy 61.06	22

