Agent Performance on AgentInstruct HELD-IN
Chart: HELD-IN score by date. Top result: GPT-4, 2.75 (Mar 19, 2024).
Updated 4d ago
Evaluation Results
Method        Details                      Date      HELD-IN
GPT-4         Provider=OpenAI, Year=...    2024.03   2.75
Agent-FLAN    -                            2024.03   2.01
AgentLM-7B    -                            2024.03   1.96
AgentTuning   Re-implementation=true       2024.03   1.89
GPT-3.5       Provider=OpenAI, Year=...    2024.03   1.59
Llama2-7B     -                            2024.03   0.19