
ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction

About

Agentic task-solving with Large Language Models (LLMs) requires multi-turn, multi-step interactions, often involving complex function calls and dynamic user-agent exchanges. Existing simulation-based data generation methods for such scenarios rely heavily on costly autoregressive interactions between multiple LLM agents, thereby compromising the practical efficiency of agentic data generation. In this paper, we propose ToolACE-MT, a novel Non-Autoregressive Iterative Generation framework for constructing high-quality multi-turn agentic dialogues. ToolACE-MT generates full conversational trajectories through three stages: coarse-grained initialization, iterative refinement, and offline verification. The initialization phase builds a structurally complete yet semantically coarse dialogue skeleton; the iterative refinement phase introduces realistic complexities and continued refinement via mask-and-fill operations; and the offline verification phase ensures correctness and coherence via rule- and model-based checks. Experiments demonstrate that ToolACE-MT enables efficient, effective and generalizable agentic data generation, offering a new paradigm for high-quality data construction in tool-augmented LLM scenarios.

Xingshan Zeng, Weiwen Liu, Lingzhi Wang, Liangyou Li, Fei Mi, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu • 2025
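
The three stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Turn`, `initialize`, `refine`, `verify`, and `stub_fill` are hypothetical, the fill step is a deterministic stub standing in for an LLM call, and only a rule-based check is shown (the model-based check is omitted). The point is the shape of the process: the whole trajectory exists from the start, and each turn is masked and refilled with both earlier and later turns as context, rather than being generated turn by turn.

```python
from dataclasses import dataclass
from typing import Callable, List

MASK = "<MASK>"  # hypothetical placeholder token for a masked turn


@dataclass
class Turn:
    role: str      # "user", "assistant", or "tool"
    content: str


def initialize(task: str) -> List[Turn]:
    """Stage 1: build a structurally complete but semantically coarse skeleton."""
    return [
        Turn("user", f"[coarse] request: {task}"),
        Turn("assistant", "[coarse] function call"),
        Turn("tool", "[coarse] tool result"),
        Turn("assistant", "[coarse] final answer"),
    ]


def refine(
    dialogue: List[Turn],
    fill: Callable[[List[Turn], int], str],
    rounds: int = 2,
) -> List[Turn]:
    """Stage 2: mask one turn at a time and refill it given the full dialogue."""
    for _ in range(rounds):
        for i, turn in enumerate(dialogue):
            masked = dialogue[:i] + [Turn(turn.role, MASK)] + dialogue[i + 1:]
            # `fill` sees turns on both sides of the masked slot.
            dialogue = dialogue[:i] + [Turn(turn.role, fill(masked, i))] + dialogue[i + 1:]
    return dialogue


def verify(dialogue: List[Turn]) -> bool:
    """Stage 3 (rule-based part only): structural checks on the finished dialogue."""
    if not dialogue:
        return False
    roles_ok = dialogue[0].role == "user" and dialogue[-1].role == "assistant"
    no_masks = all(MASK not in t.content for t in dialogue)
    tool_follows_call = all(
        i > 0 and dialogue[i - 1].role == "assistant"
        for i, t in enumerate(dialogue)
        if t.role == "tool"
    )
    return roles_ok and no_masks and tool_follows_call


def stub_fill(masked: List[Turn], index: int) -> str:
    """Stand-in for an LLM call; a real filler would condition on `masked`."""
    return f"refined {masked[index].role} turn {index}"


if __name__ == "__main__":
    result = refine(initialize("book a flight"), stub_fill)
    for t in result:
        print(f"{t.role}: {t.content}")
    print("verified:", verify(result))
```

In a real pipeline, `stub_fill` would be replaced by a model call, and `verify` would be extended with the model-based checks the abstract mentions.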

Related benchmarks

Task                     | Dataset                   | Metric                      | Result | Rank
Function Calling         | BFCL V3                   | Overall Accuracy            | 67.63  | 88
Tool-augmented Reasoning | BFCL Multi-Turn v3        | Overall Score               | 40.3   | 14
Function Calling         | BFCL v3 2025-08-26 (test) | Multi-Turn Overall Accuracy | 40.25  | 9
Agentic Dialogue         | τ-Bench (test)            | Retail Accuracy             | 25.2   | 7
Agentic Performance      | ACEBench-en               | End-to-End Accuracy         | 8.4    | 7
Multi-turn dialogue      | ACEBench-en               | MT Accuracy                 | 51     | 7
Agentic Function Calling | BFCL v4                   | Web Search Accuracy         | 8.5    | 6
Multi-Turn Tool Calling  | τ2-bench                  | Overall Score               | 13.56  | 5
