Task Planning on HuggingGPT Human Evaluation Set 130 diverse requests (test)
[Chart: Passing Rate and Rationality over time. Top Passing Rate: 0.9122 (HuggingGPT, Mar 30, 2023)]
Updated 4d ago
Evaluation Results
Method      Setting         Date     Passing Rate  Rationality
HuggingGPT  LLM=GPT-3.5     2023.03  0.9122        0.7847
HuggingGPT  LLM=Vicuna-13b  2023.03  0.7941        0.5841
HuggingGPT  LLM=Alpaca-13b  2023.03  0.5104        0.3217