Tool Learning on StableToolBench I2-Cat.

71.9 SoPR

DTA-Llama

Updated 4mo ago

Evaluation Results

Method	Links	SoPR	SoWR
DTA-Llama 2025.01		71.9	65.3
GPT-4 (Parallel) 2025.01		70.8	77.4
GPT-3.5 (DFSDT) 2025.01		69.8	68.5
GPT-4 (DFSDT) 2025.01		68	62.9
GPT-3.5 (Parallel) 2025.01		61.4	53.2
Qwen2.5 (Parallel) 2025.01		61.3	61.3
ToolLLaMA (DFSDT) 2025.01		53.4	49.2
GPT-4 (ReAct) 2025.01		48.9	62.1
GPT-3.5 (ReAct) 2025.01		43.9	-
ToolLLaMA (ReAct) 2025.01		40.9	38.7
ToolLLaMA† (DFSDT) 2025.01		39.1	39.5
LLMCompiler 2025.01		38.4	38.1
ToolLLaMA† (ReAct) 2025.01		24.5	28.2