Tool usage in multi-turn dialogue on ChatHuman 1.0 (test)
[Chart: Success Rate (Args) over time. Plotted point: ChatHuman, 100, May 7, 2024. Selectable metrics: Success Rate (Args), Success Rate, IoU, Success Rate (Turn), Success Rate (Action).]
Updated 1mo ago
Evaluation Results
Method                                   Date     Success Rate (Args)  Success Rate  IoU   Success Rate (Turn)  Success Rate (Action)
ChatHuman                                2024.05  100                  95.9          92.7  95.5                 96.2
Visual ChatGPT-4 (Base Model=GPT-4)      2024.05  86                   79.4          71.1  74.4                 78.9
GPT4Tools                                2024.05  58.2                 55.1          55.3  51.3                 61.2
Visual ChatGPT-3.5 (Base Model=GPT-3.5)  2024.05  43.8                 20.3          16.2  17.3                 69.1