Fara-7B: An Efficient Agentic Model for Computer Use

About

Progress in computer use agents (CUAs) has been constrained by the absence of large and high-quality datasets that capture how humans interact with a computer. While LLMs have thrived on abundant textual data, no comparable corpus exists for CUA trajectories. To address these gaps, we introduce FaraGen, a novel synthetic data generation system for multi-step web tasks. FaraGen can propose diverse tasks from frequently used websites, generate multiple solution attempts, and filter successful trajectories using multiple verifiers. It achieves high throughput, yield, and diversity for multi-step web tasks, producing verified trajectories at approximately $1 each. We use this data to train Fara-7B, a native CUA model that perceives the computer using only screenshots, executes actions via predicted coordinates, and is small enough to run on-device. We find that Fara-7B outperforms other CUA models of comparable size on benchmarks like WebVoyager, Online-Mind2Web, and WebTailBench -- our novel benchmark that better captures under-represented web tasks in pre-existing benchmarks. Furthermore, Fara-7B is competitive with much larger frontier models, illustrating key benefits of scalable data generation systems in advancing small efficient agentic models. We are making Fara-7B open-weight on Microsoft Foundry and HuggingFace, and we are releasing WebTailBench.
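The generate-then-filter step described above can be illustrated with a minimal Python sketch. This is not FaraGen's implementation: the `Trajectory` fields, the two toy verifiers, and the rule that an attempt is kept only when every verifier accepts it are all assumptions made for illustration, since the abstract says only that multiple verifiers are used to filter successful trajectories.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    """Hypothetical record of one solution attempt for a web task."""
    task: str
    actions: List[str]   # one entry per executed step, e.g. a click at predicted coordinates
    final_answer: str


# A verifier is any function that accepts or rejects a trajectory.
Verifier = Callable[[Trajectory], bool]


def filter_verified(attempts: List[Trajectory], verifiers: List[Verifier]) -> List[Trajectory]:
    # Assumption: an attempt is kept only if every verifier accepts it.
    return [t for t in attempts if all(v(t) for v in verifiers)]


def cost_per_verified(total_cost_usd: float, n_verified: int) -> float:
    # Average generation cost attributed to each trajectory that survived filtering.
    if n_verified == 0:
        raise ValueError("no verified trajectories")
    return total_cost_usd / n_verified


# Toy verifiers (hypothetical stand-ins for real checks).
def has_actions(t: Trajectory) -> bool:
    return len(t.actions) > 0


def has_answer(t: Trajectory) -> bool:
    return t.final_answer.strip() != ""


attempts = [
    Trajectory("find a flight", ["click(120, 340)", "type('SEA')"], "Flight found"),
    Trajectory("find a flight", [], "Flight found"),       # rejected: no actions
    Trajectory("find a flight", ["click(120, 340)"], ""),  # rejected: no answer
]
kept = filter_verified(attempts, [has_actions, has_answer])
```

Requiring agreement from all verifiers trades yield for precision. A majority vote would be an equally plausible reading of "multiple verifiers" and would only change the `all(...)` condition.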

Ahmed Awadallah, Yash Lara, Raghav Magazine, Hussein Mozannar, Akshay Nambi, Yash Pandya, Aravind Rajeswaran, Corby Rosset, Alexey Taymanov, Vibhav Vineet, Spencer Whitehead, Andrew Zhao • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Grounding | ScreenSpot v2 | Accuracy | 89.3 | 32 |
| Browser Use | WebVoyager | Success Rate | 73.5 | 14 |
| Browser Use | WebTailBench | Success Rate | 38.4 | 13 |
| Browser Use | DeepShop | Success Rate | 0.262 | 13 |
| GUI planning semantic consistency | GUIGuard-Bench Overall 1.0 (Mixed) | Black Mask Score | 2.02 | 7 |
| GUI planning semantic consistency | GUIGuard-Bench Android 1.0 | Android Avg. | 2.34 | 7 |
| GUI planning semantic consistency | GUIGuard-Bench PC 1.0 | PC Avg. | 169 | 7 |
| Visual Grounding | ScreenSpot | Accuracy | 86.7 | 6 |
| GUI Trajectory Dataset Comparison | GUI Trajectory Datasets | Website Pages Count | 7.01e+4 | 5 |
