Tool-use Search Evaluation on SLATE synthetic
[Chart: Tool Match Rate. Highlighted entry: EGB-Sampling (Ours), 68.5, Apr 13, 2026. Available metrics: Tool Match Rate, Execution Success Rate, Action Identification Accuracy.]
Updated 4d ago
Evaluation Results
Method               Model            Date     Tool Match Rate  Execution Success Rate  Action Identification Accuracy
EGB-Sampling (Ours)  Claude-Sonnet-4  2026.04  68.5             54                      87.9
ReAct                Claude-Sonnet-4  2026.04  66.4             29.3                    85.7
Baseline-LLM         Claude-Sonnet-4  2026.04  65.2             -                       -
LATS                 Claude-Sonnet-4  2026.04  63.4             36.5                    87.3
Reflexion            Claude-Sonnet-4  2026.04  61.5             44.7                    83.4