LLM-as-a-Judge on BigGen-Bench (test)
[Chart: Pearson Correlation by date. Labeled point: LLaDA (FS), 0.312, Apr 4, 2026. Selectable metrics: Pearson Correlation, Spearman Correlation, Kendall Correlation, Perplexity (PPL).]
Updated 11d ago
Evaluation Results
Method          Configuration              Date     Pearson Correlation  Spearman Correlation  Kendall Correlation  Perplexity (PPL)
LLaDA (FS)      Train=FS, Prompt Templ...  2026.04   0.312               0.548                 0.263                3.73
LLaDA (FS+RO)   Train=FS+RO, Prompt Te...  2026.04   0.259               0.525                 0.21                 3.78
LLaDA (RO)      Train=RO, Prompt Templ...  2026.04   0.205               0.406                 0.168                7.14
LLaDA (Public)  Train=Public, Prompt T...  2026.04  -0.007               0.078                 0.068                7.4
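
The three correlation columns are standard agreement statistics between two score lists. The sketch below shows how such values can be computed, assuming the judge model's scores are compared against reference (for example, human) scores for the same responses. The page does not state the exact protocol, so the function names, the sample data, and the choice of the tau-b variant for Kendall are illustrative assumptions, not the benchmark's actual evaluation code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation: linear agreement between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall tau-b (assumed variant): pairwise ordering agreement with tie correction."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both lists: contributes to neither count
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom


# Hypothetical scores for five responses (not taken from the benchmark).
reference_scores = [1, 2, 3, 4, 5]
judge_scores = [2, 1, 4, 3, 5]

print("Pearson: ", round(pearson(reference_scores, judge_scores), 3))
print("Spearman:", round(spearman(reference_scores, judge_scores), 3))
print("Kendall: ", round(kendall_tau_b(reference_scores, judge_scores), 3))
```

On this toy data the judge swaps two adjacent pairs, which gives Pearson and Spearman values of 0.8 and a Kendall value of 0.6. Kendall is typically the smallest of the three on the same data, which matches the pattern in the table, where the Kendall column is lower than the Spearman column in every row.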