Multi-agent Planning and Composition on TaskBench Hugging Face APIs (test)
[Chart: Precision by date. AutoMAS: 74.97 (May 5, 2026)]
Updated 28d ago
Evaluation Results
| Method | Date | Precision | Recall | F1-Score | Task-ArgName F1 | Edit Distance | Sequence Similarity | Type Accuracy | N-Tools Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| AutoMAS (Strategy=ReAct Style,...) | 2026.05 | 74.97 | 63.44 | 68.72 | 35.47 | 0.1859 | 81.68 | 82.77 | 70.89 |
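As a quick sanity check on the numbers above, the reported F1-Score for AutoMAS can be compared against the harmonic mean of its reported Precision and Recall. This is a minimal sketch, assuming the leaderboard uses the standard F1 definition over the same aggregate; the page itself does not state how F1 is averaged, and the helper name `f1_score` is only illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported for AutoMAS on this leaderboard (percent scale).
reported_precision = 74.97
reported_recall = 63.44
reported_f1 = 68.72

computed_f1 = f1_score(reported_precision, reported_recall)
print(round(computed_f1, 2))  # → 68.72
```

Under that assumption the computed value agrees with the reported F1-Score to two decimal places. Task-ArgName F1 is a separate metric and is not derivable from these two numbers.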