Surfer-H Meets Holo1: Cost-Efficient Web Agent Powered by Open Weights

About

We present Surfer-H, a cost-efficient web agent that integrates Vision-Language Models (VLMs) to perform user-defined tasks on the web. We pair it with Holo1, a new open-weight collection of VLMs specialized in web navigation and information extraction. Holo1 was trained on carefully curated data sources, including open-access web content, synthetic examples, and self-produced agentic data. Holo1 tops generalist User Interface (UI) benchmarks as well as our new web UI localization benchmark, WebClick. When powered by Holo1, Surfer-H achieves state-of-the-art performance of 92.2% on WebVoyager, striking a Pareto-optimal balance between accuracy and cost-efficiency. To accelerate research in agentic systems, we are open-sourcing both our WebClick evaluation dataset and the Holo1 model weights.

Mathieu Andreux, Breno Baldas Skuk, Hamza Benchekroun, Emilien Biré, Antoine Bonnet, Riaz Bordie, Nathan Bout, Matthias Brunel, Pierre-Louis Cedoz, Antoine Chassang, Mickaël Chen, Alexandra D. Constantinou, Antoine d'Andigné, Hubert de La Jonquière, Aurélien Delfosse, Ludovic Denoyer, Alexis Deprez, Augustin Derupti, Michael Eickenberg, Mathïs Federico, Charles Kantor, Xavier Koegler, Yann Labbé, Matthew C. H. Lee, Erwan Le Jumeau de Kergaradec, Amir Mahla, Avshalom Manevich, Adrien Maret, Charles Masson, Rafaël Maurin, Arturo Mena, Philippe Modard, Axel Moyal, Axel Nguyen Kerbel, Julien Revelle, Mats L. Richter, María Santos, Laurent Sifre, Maxime Theillard, Marc Thibault, Louis Thiry, Léo Tronchon, Nicolas Usunier, Tony Wu • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Grounding | ScreenSpot v2 | Accuracy: 89.9 | 32 |
| Browser Use | WebVoyager | Success Rate: 55.4 | 14 |
| Visual Grounding | ScreenSpot | Accuracy: 87.4 | 6 |