Tool-use performance on Tau-bench Retail (test)
[Chart: Pass Rate. Highlighted entry: gpt-4o-FC-JOSH-SFT, Pass Rate 66, Sep 6, 2024.]
Evaluation Results
Method                      Method details             Links    Pass Rate
gpt-4o-FC-JOSH-SFT          Backbone=gpt-4o, Promp...  2024.09  66
gpt-4o-FC                   Backbone=gpt-4o, Promp...  2024.09  65.21
gpt-4o-ACT-JOSH-SFT         Backbone=gpt-4o, Promp...  2024.09  64.26
gpt-4o-ACT                  Backbone=gpt-4o, Promp...  2024.09  63.13
gpt-4o-ReACT-JOSH-SFT       Backbone=gpt-4o, Promp...  2024.09  58.43
gpt-4o-mini-FC-JOSH-SFT     Backbone=gpt-4o-mini,...   2024.09  58.26
gpt-4o-ReACT                Backbone=gpt-4o, Promp...  2024.09  54.43
gpt-4o-mini-FC              Backbone=gpt-4o-mini,...   2024.09  50.78
gpt-4o-mini-ACT-JOSH-SFT    Backbone=gpt-4o-mini,...   2024.09  47.65
gpt-4o-mini-ACT             Backbone=gpt-4o-mini,...   2024.09  44.6
gpt-4o-mini-ReACT-JOSH-SFT  Backbone=gpt-4o-mini,...   2024.09  36.34
gpt-4o-mini-ReACT           Backbone=gpt-4o-mini,...   2024.09  16.87
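
The results above appear to pair each method with a "-JOSH-SFT" counterpart. As a rough illustration, the following sketch computes the absolute Pass Rate difference for each pair. It assumes that a method named X is the baseline for X-JOSH-SFT, which is inferred from the naming only. The dictionary layout and the function name `josh_sft_gains` are hypothetical and not part of the leaderboard.

```python
# Pass Rate values copied from the results listed above.
# Each entry maps a method name (assumed baseline) to
# (baseline Pass Rate, Pass Rate of the matching "-JOSH-SFT" entry).
PASS_RATES = {
    "gpt-4o-FC": (65.21, 66.0),
    "gpt-4o-ACT": (63.13, 64.26),
    "gpt-4o-ReACT": (54.43, 58.43),
    "gpt-4o-mini-FC": (50.78, 58.26),
    "gpt-4o-mini-ACT": (44.6, 47.65),
    "gpt-4o-mini-ReACT": (16.87, 36.34),
}


def josh_sft_gains(rates):
    """Return the Pass Rate difference (JOSH-SFT minus baseline) per method."""
    return {
        name: round(with_josh - baseline, 2)
        for name, (baseline, with_josh) in rates.items()
    }


if __name__ == "__main__":
    gains = josh_sft_gains(PASS_RATES)
    # Print the pairs from largest to smallest difference.
    for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {gain:+.2f}")
```

Under that pairing assumption, the differences range from under one point for gpt-4o-FC to roughly nineteen points for gpt-4o-mini-ReACT.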