Tool Use Accuracy on Seen Tools

SRt: 100 (ChatHuman)

Updated 5mo ago

Evaluation Results

Method	Date	Links	SRt
ChatHuman	2024.05		100	97.3	95.1	96.6	97.4
ChatHuman	2024.05		100	97.4	95	97	97.5
Ours w/ GPT-4	2024.05		95.3	92	73.2	75.1	87.5
Visual ChatGPT-4	2024.05		89.2	80.2	71.5	75.3	79.7
GPT4Tools-FT	2024.05		82.5	71	68.7	69	74.1
GPT4Tools	2024.05		60.9	54.7	52.5	52	56.6
Visual ChatGPT-3.5	2024.05		49.8	31.9	23.7	25.1	79.1