
LLM-as-a-Judge on BigGen-Bench (test)

0.312 Pearson Correlation

LLaDA (FS)

Updated 3mo ago

Evaluation Results

Method	Date	Links	Pearson Correlation	(unlabeled)	(unlabeled)	(unlabeled)
LLaDA (FS)	2026.04		0.312	0.548	0.263	3.73
LLaDA (FS+RO)	2026.04		0.259	0.525	0.21	3.78
LLaDA (RO)	2026.04		0.205	0.406	0.168	7.14
LLaDA (Public)	2026.04		-0.007	0.078	0.068	7.4
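
For readers unfamiliar with the headline metric, the following is a minimal sketch of how a Pearson correlation can be computed. It assumes the metric compares a judge model's scores against reference scores for the same responses, which this page does not state explicitly. The function name `pearson_correlation` and the sample score lists are hypothetical and are not taken from the benchmark.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: scores from a judge model and matching reference scores.
judge_scores = [1, 2, 3, 4, 5]
reference_scores = [2, 4, 6, 8, 10]
r = pearson_correlation(judge_scores, reference_scores)
```

A value near 1 means the two sets of scores rise and fall together, a value near 0 means no linear relationship, and a negative value means they move in opposite directions.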