NLG Evaluation on USR
[Chart: Spearman Correlation over time. Top entry: GPT-4o, 0.82 (Feb 9, 2026).]
Evaluation Results

| Method | Prompting | Date | Spearman Correlation |
| --- | --- | --- | --- |
| GPT-4o | Few-shot | 2026.02 | 0.82 |
| GPT-4o | Zero-shot | 2026.02 | 0.81 |
| Qwen | Zero-shot | 2026.02 | 0.81 |
| Qwen | Few-shot | 2026.02 | 0.80 |
| GPT-4o-mini | Zero-shot | 2026.02 | 0.79 |
| Llama | Few-shot | 2026.02 | 0.79 |
| Mixtral | Zero-shot | 2026.02 | 0.79 |
| Mixtral | Few-shot | 2026.02 | 0.79 |
| GPT-4o-mini | Few-shot | 2026.02 | 0.78 |
| Llama | Zero-shot | 2026.02 | 0.77 |
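
For readers unfamiliar with the metric, here is a minimal sketch of how a Spearman correlation can be computed, assuming it is taken between an evaluator's scores and human ratings for the same responses. The page does not describe its evaluation pipeline, so this is not the leaderboard's code. The function names and the sample data (`human_ratings`, `model_scores`) are hypothetical and chosen only for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one human rating and one evaluator score per response.
human_ratings = [3.0, 1.0, 2.0, 3.0, 2.5]
model_scores = [0.9, 0.2, 0.4, 0.7, 0.6]

rho = spearman(human_ratings, model_scores)
print(f"Spearman correlation: {rho:.4f}")
```

Because only ranks matter, the evaluator's scores do not need to be on the same scale as the human ratings. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead of a hand-rolled version.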