NLG Meta-evaluation on CUS-QA orig. (sk)
[Chart: Kendall Correlation over time. Top result: Qwen 3 30B, 0.788 (Mar 10, 2026).]
Evaluation Results

| Method        | Shot setting | Date    | Kendall Correlation |
|---------------|--------------|---------|---------------------|
| Qwen 3 30B    | Zero         | 2026.03 | 0.788               |
| Llama 4 Scout | Few          | 2026.03 | 0.735               |
| Llama 3.3 70B | Few          | 2026.03 | 0.73                |
| Qwen 3 30B    | Few          | 2026.03 | 0.709               |
| Llama 4 Scout | Zero         | 2026.03 | 0.704               |
| Llama 3.3 70B | Zero         | 2026.03 | 0.656               |