Tool Use on Simulated 120-tool benchmark 500 tasks 1.0

480 Tokens per Turn

B4 CLI Lazy

Apr 23, 2026
Updated 1mo ago

Evaluation Results

Method  Links
480     0.948  82.4  5.4  0.03
2,368   0.919  42    4.3  0.03
4,082   0.788  12.2  4.6  0.04
11,865  0.565  83.8  7.1  0.09
47,312  0.247  24.2  7.9  0.21