EnterpriseLab: A Full-Stack Platform for developing and deploying agents in Enterprises

About

Deploying AI agents in enterprise environments requires balancing capability with data sovereignty and cost constraints. While small language models offer privacy-preserving alternatives to frontier models, their specialization is hindered by fragmented development pipelines that separate tool integration, data generation, and training. We introduce EnterpriseLab, a full-stack platform that unifies these stages into a closed-loop framework. EnterpriseLab provides (1) a modular environment exposing enterprise applications via Model Context Protocol, enabling seamless integration of proprietary and open-source tools; (2) automated trajectory synthesis that programmatically generates training data from environment schemas; and (3) integrated training pipelines with continuous evaluation. We validate the platform through EnterpriseArena, an instantiation with 15 applications and 140+ tools across IT, HR, sales, and engineering domains. Our results demonstrate that 8B-parameter models trained within EnterpriseLab match GPT-4o's performance on complex enterprise workflows while reducing inference costs by 8-10x, and remain robust across diverse enterprise benchmarks, including EnterpriseBench (+10%) and CRMArena (+10%). EnterpriseLab provides enterprises a practical path to deploying capable, privacy-preserving agents without compromising operational capability.

Ankush Agarwal, Harsh Vishwakarma, Suraj Nagaje, Chaitanya Devaguptapu • 2026

Related benchmarks

Task                        Dataset                 Metric                   Result  Rank
Enterprise task completion  EnterpriseBench         Execution Score          0.38    12
Agent Execution             CRMArena (test)         Execution Accuracy       35      8
Agent Execution             EnterpriseBench (test)  Execution Accuracy       51      8
Agent Execution             EnterpriseArena (test)  Execution Accuracy       43      8
Agent Execution             tau-Bench (test)        Execution Accuracy       42      8
Tool selection              EnterpriseArena         Tool Selection Accuracy  28      8
Tool selection              EnterpriseBench         Tool Selection Accuracy  21      8
