AutoWebGLM: A Large Language Model-based Web Navigating Agent

About

Large language models (LLMs) have fueled many intelligent web agents, but most existing ones perform far from satisfactorily in real-world web navigation tasks due to three factors: (1) the complexity of HTML text data, (2) the versatility of actions on webpages, and (3) task difficulty due to the open-domain nature of the web. In light of these challenges, we develop the open AutoWebGLM based on ChatGLM3-6B. AutoWebGLM can serve as a powerful automated web navigation agent that outperforms GPT-4. Inspired by human browsing patterns, we first design an HTML simplification algorithm that represents webpages succinctly while preserving vital information. We then employ a hybrid human-AI method to build web browsing data for curriculum training. Finally, we bootstrap the model by reinforcement learning and rejection sampling to further facilitate webpage comprehension, browser operations, and efficient task decomposition by itself. For comprehensive evaluation, we establish a bilingual benchmark, AutoWebBench, for real-world web navigation tasks. We evaluate AutoWebGLM across diverse web navigation benchmarks, demonstrating its potential to tackle challenging tasks in real environments. Related code, model, and data are released at https://github.com/THUDM/AutoWebGLM.
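
To make the idea of HTML simplification concrete, here is a minimal sketch in Python. It is not the AutoWebGLM algorithm, which this page does not describe in detail. It only illustrates the general idea of dropping scripts and styles, keeping visible text, and giving interactive elements a numeric index an agent could refer to. The tag and attribute lists, the class name `PageSimplifier`, and the function `simplify_html` are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Assumed, illustrative lists; the paper's actual rules may differ.
KEEP_TAGS = {"a", "button", "input", "select", "textarea"}  # interactive elements
SKIP_TAGS = {"script", "style", "noscript"}                 # never shown to the user
VOID_TAGS = {"input"}                                       # kept tags with no closing tag
KEEP_ATTRS = ("type", "placeholder", "aria-label", "href")  # attributes worth keeping


class PageSimplifier(HTMLParser):
    """Collects visible text and indexed interactive elements, one per line."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._skip = 0        # nesting depth inside skipped tags
        self._current = None  # (tag, label, text_parts) of the open kept element
        self._count = 0       # next element index

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif not self._skip and tag in KEEP_TAGS:
            self.flush()  # close any kept element that is still open
            kept = " ".join(
                '{}="{}"'.format(k, v) for k, v in attrs if k in KEEP_ATTRS and v
            )
            label = "[{}] <{}{}>".format(self._count, tag, " " + kept if kept else "")
            self._count += 1
            if tag in VOID_TAGS:
                self.lines.append(label)
            else:
                self._current = (tag, label, [])

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip:
            self._skip -= 1
        elif self._current and tag == self._current[0]:
            self.flush()

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if self._skip or not text:
            return
        if self._current:
            self._current[2].append(text)
        else:
            self.lines.append(text)

    def flush(self):
        """Emit the currently open kept element, if any."""
        if self._current:
            _, label, parts = self._current
            self.lines.append((label + " " + " ".join(parts)).rstrip())
            self._current = None


def simplify_html(html):
    """Return a compact, line-oriented view of the page."""
    parser = PageSimplifier()
    parser.feed(html)
    parser.close()
    parser.flush()
    return "\n".join(parser.lines)
```

Calling `simplify_html` on a page yields one line per text fragment or interactive element, so an agent could name its target by index instead of reading the raw markup.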

Hanyu Lai, Xiao Liu, Iat Long Iong, Shuntian Yao, Yuxuan Chen, Pengbo Shen, Hao Yu, Hanchen Zhang, Xiaohan Zhang, Yuxiao Dong, Jie Tang • 2024

Related benchmarks

Task	Dataset	Result
Web navigation	WebArena	Overall Success Rate 18.2	138
Web navigation and task completion	WebArena (test)	Average Task Completion 18.2	137
Mobile GUI Automation	AndroidWorld	Overall Success Rate 10.43	68
GUI Navigation	Multimodal-Mind2Web Cross-Website	Step Success Rate 56.4	37
GUI Navigation	Multimodal-Mind2Web Cross-Task	Step Success Rate 66.4	32
GUI Navigation	Multimodal-Mind2Web Cross-Domain	Step Success Rate 55.8	32
GUI Agent Planning and Execution	WebArena	Success Rate (Gitlab) 9.52	32
