SOTA LLM-as-Judge evaluation benchmarks and papers with code | Wizwand

LLM-as-Judge evaluation

Benchmarks

Dataset Name	SOTA Method	Metric	Trend
HH dataset	RMOD	WCWR 59.1		5	4mo ago
BookCorpus		Diversity Rank 2.4		3	2mo ago