Instruction Following Evaluation on SELF-INSTRUCT seed data
Top score: 72.01 (GPT-4 Turbo, Sep 27, 2024)
[Chart: Score, Mean, and Variance by date]
Evaluation Results
Method                Settings                    Links     Score   Mean    Variance
GPT-4 Turbo           evaluation_scale=perce...   2024.09   72.01   -       -
GLM-4                 evaluation_scale=perce...   2024.09   71.86   -       -
Claude3               evaluation_scale=perce...   2024.09   71.71   -       -
Qwen                  evaluation_scale=perce...   2024.09   71.11   -       -
GPT-4                 evaluation_scale=perce...   2024.09   70.06   -       -
Aggregate Statistics  -                           2024.09   -       71.35   0.51
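
The page does not say how the aggregate row is derived. As a rough check, the sketch below recomputes the mean and variance from the five listed scores, assuming the variance is the population variance (dividing by n rather than n - 1). Under that assumption the results round to the listed 71.35 and 0.51. The variable names are illustrative and not part of the leaderboard.

```python
from statistics import fmean, pvariance

# Scores copied from the Evaluation Results table (2024.09 entries).
scores = {
    "GPT-4 Turbo": 72.01,
    "GLM-4": 71.86,
    "Claude3": 71.71,
    "Qwen": 71.11,
    "GPT-4": 70.06,
}

values = list(scores.values())

# Arithmetic mean of the five scores.
mean = fmean(values)

# Population variance (divides by n); this is an assumption about
# how the page computes its "Variance" column.
variance = pvariance(values)

print(f"mean = {mean:.2f}")
print(f"variance = {variance:.2f}")
```

The sample variance (dividing by n - 1) of the same scores comes out noticeably higher, at about 0.64, so the population form is the one that matches the listed figure.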