SOTA Multi-judge evaluation benchmarks and papers with code | Wizwand

Multi-judge evaluation

Benchmarks

Dataset Name	SOTA Method	Metric	Trend
Shared 500-prompt sample		Global Correlation (r): 0.87		5	4mo ago
