
Tool-use Reasoning on ToolBench

API Success Rate: 88

ExpRAG

[Chart: API Success Rate; axis ticks 60.96, 67.98, 75, 82.02; date shown: Nov 25, 2025]
Updated 15d ago

Evaluation Results

Date      API Success Rate   Second score
2025.11   88                 72
2025.11   87                 71
2025.11   87                 73
2025.11   85                 71
2025.11   82                 66
2025.11   78                 66
2025.11   77                 63
2025.11   77                 63
2025.11   77                 63
2025.11   77                 62
2025.11   77                 62
2025.11   76                 62
2025.11   76                 61
2025.11   76                 61
2025.11   76                 62
2025.11   76                 61
2025.11   76                 62
2025.11   72                 60
2025.11   72                 61
2025.11   71                 61
2025.11   71                 61
2025.11   71                 61
2025.11   71                 59
2025.11   70                 59
2025.11   68                 57
2025.11   68                 57
2025.11   64                 57
2025.11   62                 54