
Inducing Programmatic Skills for Agentic Tasks

About

To succeed in common digital tasks such as web navigation, agents must carry out a variety of specialized subtasks, such as searching for products or planning a travel route. To tackle these, agents can bootstrap themselves by learning task-specific skills online through interaction with the web environment. In this work, we demonstrate that programs are an effective representation for skills. We propose agent skill induction (ASI), which allows agents to adapt themselves by inducing, verifying, and utilizing program-based skills on the fly. We start with an evaluation on the WebArena agent benchmark and show that ASI outperforms the static baseline agent and its text-skill counterpart by 23.5% and 11.3% in success rate, respectively, mainly thanks to the programmatic verification guarantee during the induction phase. ASI also improves efficiency, taking 10.7-15.3% fewer steps than the baselines by composing primitive actions (e.g., click) into higher-level skills (e.g., search product). We then show that ASI remains efficient and accurate under scaled-up web activities. Finally, we examine the generalizability of induced skills when transferring between websites, and find that ASI can effectively reuse common skills while updating incompatible skills to accommodate website changes.

Zora Zhiruo Wang, Apurva Gandhi, Graham Neubig, Daniel Fried • 2025
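
The abstract's idea of composing primitive actions (e.g., click) into a higher-level skill (e.g., search product), and keeping the skill only if it passes a programmatic check, can be illustrated with a small sketch. This is not the authors' implementation: `ToyBrowser`, `type_text`, `verify_skill`, the element ids, and the check itself are all hypothetical names and assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ToyBrowser:
    """Hypothetical stand-in for a web environment; it only records primitive actions."""
    log: List[str] = field(default_factory=list)
    fields: Dict[str, str] = field(default_factory=dict)
    page: str = "home"

    def click(self, element_id: str) -> None:
        # Primitive action: clicking the (assumed) search button loads a results page.
        self.log.append(f"click({element_id})")
        if element_id == "search_button":
            self.page = f"results:{self.fields.get('search_box', '')}"

    def type_text(self, element_id: str, text: str) -> None:
        # Primitive action: store the typed text for the given field.
        self.log.append(f"type_text({element_id}, {text})")
        self.fields[element_id] = text


def search_product(env: ToyBrowser, query: str) -> None:
    """A higher-level skill written as a program over three primitive actions."""
    env.click("search_box")
    env.type_text("search_box", query)
    env.click("search_button")


def verify_skill(
    skill: Callable[..., None],
    args: Tuple,
    check: Callable[[ToyBrowser], bool],
) -> bool:
    """Assumed verification step: run the skill in a fresh environment and test the outcome."""
    env = ToyBrowser()
    try:
        skill(env, *args)
    except Exception:
        return False  # A skill that crashes is rejected.
    return check(env)


# Only skills that pass verification enter the (hypothetical) skill library.
skill_library: Dict[str, Callable[..., None]] = {}
if verify_skill(search_product, ("red shoes",), lambda e: e.page == "results:red shoes"):
    skill_library["search_product"] = search_product
```

In this toy setting, one call to `skill_library["search_product"](env, "laptop")` stands in for three primitive steps, which is the kind of step reduction the abstract attributes to higher-level skills.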

Related benchmarks

Task                               | Dataset                      | Metric                  | Result | Rank
-----------------------------------|------------------------------|-------------------------|--------|-----
Web navigation and task completion | WebArena (test)              | Average Task Completion | 55.8   | 137
Web Agent Navigation               | MIND2WEB Cross-Task 1.0      | Success Rate            | 47     | 26
Web Agent Navigation               | MIND2WEB Cross-Domain 1.0    | Success Rate            | 45.1   | 26
Web Navigation Task Completion     | Mind2Web Cross-Task          | Success Rate            | 62.1   | 18
Web Navigation Task Completion     | Mind2Web (Cross-website 177) | Success Rate            | 65.1   | 14
Web navigation                     | MIND2WEB Cross-Website 1.0   | Success Rate            | 46.2   | 10
Web Navigation Task Completion     | Mind2Web Cross-Domain        | Success Rate            | 67.3   | 10
Web navigation                     | Mind2Web Cross-Domain        | Success Rate (Acc)      | 62.1   | 8
Web navigation                     | Mind2Web Cross-Task          | Success Rate            | 59.4   | 5
Web navigation                     | Mind2Web (Cross-Website)     | Success Rate (Acc)      | 58.7   | 5
