Human Evaluation on 200 human-generated instructions
[Chart: Success Rate by date. Olympus: 0.865 (Dec 12, 2024)]
Evaluation Results
Method                              Date      Success Rate
Olympus                             2024.12   0.865
HuggingGPT (Backbone=GPT-4o)        2024.12   0.752
HuggingGPT (Backbone=GPT-4o mini)   2024.12   0.658