Tool usage in multi-turn dialogue on ChatHuman 1.0 (test)
[Chart: Success Rate (Args). ChatHuman, May 7, 2024: 100. Selectable metrics: Success Rate (Args), Success Rate, IoU, Success Rate (Turn), Success Rate (Action).]
Updated 4d ago
Evaluation Results
| Method | Date | Success Rate (Args) | Success Rate | IoU | Success Rate (Turn) | Success Rate (Action) |
|---|---|---|---|---|---|---|
| ChatHuman | 2024.05 | 100 | 95.9 | 92.7 | 95.5 | 96.2 |
| Visual ChatGPT-4 (Base Model=GPT-4) | 2024.05 | 86 | 79.4 | 71.1 | 74.4 | 78.9 |
| GPT4Tools | 2024.05 | 58.2 | 55.1 | 55.3 | 51.3 | 61.2 |
| Visual ChatGPT-3.5 (Base Model=GPT-3.5) | 2024.05 | 43.8 | 20.3 | 16.2 | 17.3 | 69.1 |