Task Routing on OlympusBench chain-of-action setting
[Chart: Exact Difference (ED) by date. Olympus: 18 (Dec 12, 2024).]
Updated 4d ago
Evaluation Results
| Method | Date | Exact Difference (ED) | Precision | Recall | F1 Score |
| --- | --- | --- | --- | --- | --- |
| Olympus | 2024.12 | 18 | 91.82 | 92.75 | 91.98 |
| HuggingGPT (backbone=GPT-4o) | 2024.12 | 35 | 75.03 | 60.23 | 61.25 |
| HuggingGPT (backbone=GPT-4o mini) | 2024.12 | 45 | 65.14 | 48.51 | 53.14 |