
TaskBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Function Selection | TaskBench HuggingFace | Function Selection Accuracy 77.1 | 45 |
| Function Selection | TaskBench Multimedia | Function Selection Acc 82.3 | 36 |
| Function Selection | TaskBench DailyLife | Function Selection Accuracy 96.8 | 36 |
| Task Planning | TaskBench Daily Life | Node-F1 97.36 | 25 |
| Task Planning | TaskBench Multimedia | Node F1 88.54 | 25 |
| Tool Retrieval and Function Selection | Taskbench-HF | MRR 0.75 | 18 |
| Task Planning | TaskBench Multimedia v1 (test) | n-F1 88.54 | 14 |
| Tool Retrieval and Function Selection | Taskbench DL | Function Selection Accuracy 58.9 | 9 |
| Tool Retrieval and Function Selection | Taskbench-MM | Function Selection Accuracy 27 | 9 |