Instruction Following Evaluation on Instruction Tuning with GPT-4
Top score: 71.29 (Claude3)
[Chart: Score, Mean, and Variance by date; data point shown for Sep 27, 2024]
Evaluation Results
| Method | Settings | Date | Score | Mean | Variance |
|---|---|---|---|---|---|
| Claude3 | evaluation_scale=perce... | 2024.09 | 71.29 | - | - |
| Qwen | evaluation_scale=perce... | 2024.09 | 70.14 | - | - |
| GLM-4 | evaluation_scale=perce... | 2024.09 | 69.89 | - | - |
| GPT-4 Turbo | evaluation_scale=perce... | 2024.09 | 69.25 | - | - |
| GPT-4 | evaluation_scale=perce... | 2024.09 | 67.58 | - | - |
| Aggregate Statistics | | 2024.09 | - | 69.63 | 1.49 |
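
The aggregate mean and variance can be checked against the five listed scores. The sketch below assumes the page reports the population variance (dividing by n rather than n - 1). The page does not state this, but the assumption reproduces the published 1.49, whereas the sample variance would come out near 1.86. The variable names are illustrative only.

```python
from statistics import fmean, pvariance

# Scores copied from the Evaluation Results listing (2024.09 entries).
scores = {
    "Claude3": 71.29,
    "Qwen": 70.14,
    "GLM-4": 69.89,
    "GPT-4 Turbo": 69.25,
    "GPT-4": 67.58,
}

values = list(scores.values())

# Arithmetic mean of the five scores.
mean_score = fmean(values)

# Population variance (divide by n); an assumption about how the page
# computes its "Variance" figure.
variance_score = pvariance(values)

print(f"Mean: {mean_score:.2f}")          # Mean: 69.63
print(f"Variance: {variance_score:.2f}")  # Variance: 1.49
```

Both rounded values agree with the Aggregate Statistics entry, which suggests the aggregate is computed over exactly these five methods.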