From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions

About

Tool learning enables Large Language Models (LLMs) to interact with external environments by invoking tools, serving as an effective strategy to mitigate the limitations inherent in their pre-training data. In this process, tool documentation plays a crucial role by providing usage instructions for LLMs, thereby facilitating effective tool utilization. This paper concentrates on the critical challenge of bridging the comprehension gap between LLMs and external tools due to the inadequacies and inaccuracies inherent in existing human-centric tool documentation. We propose a novel framework, DRAFT, aimed at Dynamically Refining tool documentation through the Analysis of Feedback and Trials emanating from LLMs' interactions with external tools. This methodology pivots on an innovative trial-and-error approach, consisting of three distinct learning phases: experience gathering, learning from experience, and documentation rewriting, to iteratively enhance the tool documentation. This process is further optimized by implementing a diversity-promoting exploration strategy to ensure explorative diversity and a tool-adaptive termination mechanism to prevent overfitting while enhancing efficiency. Extensive experiments on multiple datasets demonstrate that DRAFT's iterative, feedback-based refinement significantly ameliorates documentation quality, fostering a deeper comprehension and more effective utilization of tools by LLMs. Notably, our analysis reveals that the tool documentation refined via our approach demonstrates robust cross-model generalization capabilities.
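The three phases named in the abstract can be pictured as a loop. The sketch below is illustrative only and is not the authors' implementation: the callables `explore`, `call_tool`, `analyze`, and `rewrite` are hypothetical stand-ins for the LLM and the tool, and both the string-similarity measure and the thresholds are assumptions chosen to keep the example self-contained. It assumes the diversity check rejects queries too similar to earlier ones, and that termination fires once a rewrite barely changes the documentation.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; an assumed stand-in metric."""
    return SequenceMatcher(None, a, b).ratio()


def refine_documentation(
    doc: str,
    explore: Callable[[str, List[Tuple[str, str]]], str],  # hypothetical: proposes a trial query
    call_tool: Callable[[str], str],                        # hypothetical: runs the tool
    analyze: Callable[[str, str, str], str],                # hypothetical: turns feedback into a suggestion
    rewrite: Callable[[str, str], str],                     # hypothetical: applies the suggestion
    max_rounds: int = 5,
    diversity_threshold: float = 0.9,   # assumed value
    stop_threshold: float = 0.95,       # assumed value
) -> Tuple[str, List[Tuple[str, str]]]:
    history: List[Tuple[str, str]] = []
    for _ in range(max_rounds):
        # Phase 1: experience gathering, skipping near-duplicate queries.
        query = explore(doc, history)
        if any(similarity(query, past) >= diversity_threshold for past, _ in history):
            continue
        result = call_tool(query)
        history.append((query, result))

        # Phase 2: learning from experience.
        suggestion = analyze(doc, query, result)

        # Phase 3: documentation rewriting.
        new_doc = rewrite(doc, suggestion)

        # Assumed termination rule: stop once the rewrite barely changes the text.
        converged = similarity(new_doc, doc) >= stop_threshold
        doc = new_doc
        if converged:
            break
    return doc, history
```

Passing the four roles in as callables keeps the loop independent of any particular model or tool API; a real system would replace them with prompts to an LLM and calls to the actual tool.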

Changle Qu, Sunhao Dai, Xiaochi Wei, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Jun Xu, Ji-Rong Wen • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
Tool Use | ToolBench | Average Pass Rate | 56.43 | 29
Tool Use | StableToolBench | I2 Category Success | 68.5 | 28
Question Answering | HotpotQA | F1 Score | 57.71 | 15
Tool Use | StableToolBench G1 Category | SL | 73.2 | 12
Tool Use | StableToolBench G3 Instruction | SL Score | 63.2 | 6
Tool Use | StableToolBench G2 Instruction | SL Score | 68.2 | 6
Tool Use | StableToolBench Overall Average | SL (Success Rate) | 68.1 | 6
Tool Use | StableToolBench G1 Instruction | SL Score | 70 | 6
Tool Use | StableToolBench G2 Category | SL | 66.3 | 6
Tool Execution | Trace-based setting | Improvement (%) | 11.1 | 4
Showing 10 of 11 rows
