RefTool: Reference-Guided Tool Creation for Knowledge-Intensive Reasoning

About

Large Language Models (LLMs) can enhance their reasoning capabilities by using external tools. However, many tasks lack predefined tools. Prior works have explored instructing LLMs to generate tools on their own, but such approaches depend heavily on internal knowledge and struggle when tasks fall outside the model's knowledge scope. To address this limitation, we propose RefTool, a reference-guided framework for automatic tool creation that leverages external materials, such as textbooks and knowledge snippets. RefTool consists of two modules: (1) tool creation, where LLMs generate executable tools from reference content, validate them using illustrative examples, and organize them hierarchically into a toolbox; and (2) tool utilization, where LLMs navigate the toolbox structure to select and apply the appropriate tools to solve problems. Experiments on causality, physics, and chemistry benchmarks demonstrate that RefTool outperforms existing tool-creation and domain-specific reasoning methods by 12.3% on average accuracy, while being cost-efficient and broadly generalizable to non-scientific tasks, e.g., extremely low-resource language translation. Analyses reveal that grounding tool creation in references produces accurate and faithful tools, and that the hierarchical structure facilitates effective tool selection. RefTool enables LLMs to overcome internal knowledge limitations, advancing generalizable reasoning in knowledge-intensive domains.
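The two modules described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Tool`, `Toolbox`, `validate`, `add`, and `select` are hypothetical, the hand-written lambdas stand in for tools that an LLM would generate from reference text, and the explicit chapter/name lookup stands in for LLM-driven navigation of the hierarchy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Tool:
    """A candidate tool plus the illustrative examples used to check it (hypothetical structure)."""
    name: str
    description: str
    func: Callable[..., float]
    # (args, expected_output) pairs, assumed to come from the reference material
    examples: List[Tuple[tuple, float]] = field(default_factory=list)

    def validate(self, tol: float = 1e-6) -> bool:
        """Return True only if the tool reproduces every illustrative example."""
        for args, expected in self.examples:
            try:
                if abs(self.func(*args) - expected) > tol:
                    return False
            except Exception:
                return False
        return True


class Toolbox:
    """Two-level hierarchy (assumed for illustration): chapter -> tool name -> Tool."""

    def __init__(self) -> None:
        self.chapters: Dict[str, Dict[str, Tool]] = {}

    def add(self, chapter: str, tool: Tool) -> bool:
        """Tool creation step: store the tool only if it passes validation."""
        if not tool.validate():
            return False
        self.chapters.setdefault(chapter, {})[tool.name] = tool
        return True

    def select(self, chapter: str, name: str) -> Tool:
        """Tool utilization step: descend the hierarchy to a single tool."""
        return self.chapters[chapter][name]


toolbox = Toolbox()

# A correct tool and a deliberately buggy one, checked against the same example.
good = Tool("kinetic_energy", "KE = 0.5 * m * v^2",
            lambda m, v: 0.5 * m * v ** 2, [((2.0, 3.0), 9.0)])
bad = Tool("kinetic_energy_buggy", "KE = m * v^2 (wrong)",
           lambda m, v: m * v ** 2, [((2.0, 3.0), 9.0)])

added_good = toolbox.add("mechanics", good)
added_bad = toolbox.add("mechanics", bad)
result = toolbox.select("mechanics", "kinetic_energy").func(4.0, 5.0)
```

In this sketch the buggy tool is rejected because it fails its illustrative example, so only validated tools are reachable through the hierarchy.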

Xiao Liu, Da Yin, Zirui Wu, Yansong Feng • 2025

Related benchmarks

Task                     Dataset                       Result          Rank
Causality                QRData                        Accuracy 52     36
Chemistry                SciBench                      Accuracy 66.4   32
Physics                  TheoremQA                     Accuracy 58.8   28
Physics Problem Solving  Scibench fund                 Accuracy 74.6   24
Machine Translation      ZHUANGRULES Zhuang → Chinese  BLEU 57.2       10
Machine Translation      ZHUANGRULES Chinese → Zhuang  BLEU 54.2       10
