
Average across benchmarks

Benchmarks

Task Name     Dataset Name                      SOTA Result          Trend
LLM Routing   Average across Benchmarks (val)   Avg Top-1 Acc: 83    14
Tool Calling  Average across 5 benchmarks       F1 (Name): 88.47     9