
Human-Metric Correlation on SimpEval In-Distribution

0.321 Kendall's Tau

AutoMetrics

Updated 5mo ago

Evaluation Results

Method	Date	Kendall's Tau
AutoMetrics	2025.12	0.321
AutoMetrics	2025.12	0.316
LLM-Judge	2025.12	0.294
LLM-Judge	2025.12	0.272
Best Existing Metric	2025.12	0.246
DnA Eval	2025.12	0.234
MetaMetrics	2025.12	0.127
Finetuned LLM	2025.12	0.076
DnA Eval	2025.12	0.042
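
The scores above are Kendall's Tau correlations between a metric's outputs and human judgments. As a rough illustration of how such a correlation can be computed, here is a minimal sketch in pure Python. It assumes the tau-b variant (which corrects for ties); the page does not say which variant the benchmark uses, and the function name `kendall_tau_b` and the sample data are hypothetical, not taken from SimpEval.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length score lists (O(n^2) sketch).

    Assumption: tau-b is used here for illustration only; the leaderboard
    does not state which Kendall variant it reports.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = 0
    tied_x = tied_y = 0  # pairs tied in x / tied in y (a pair can count in both)

    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            tied_x += 1
        if dy == 0:
            tied_y += 1
        if dx != 0 and dy != 0:
            if dx * dy > 0:
                concordant += 1  # both lists order this pair the same way
            else:
                discordant += 1  # the lists disagree on this pair

    n = len(x)
    total_pairs = n * (n - 1) // 2
    denom = sqrt((total_pairs - tied_x) * (total_pairs - tied_y))
    if denom == 0:
        return float("nan")  # undefined when one list is constant or n < 2
    return (concordant - discordant) / denom


# Hypothetical data: human ratings vs. an automatic metric's scores.
human_ratings = [1, 2, 3, 4, 5]
metric_scores = [0.1, 0.3, 0.2, 0.5, 0.4]
tau = kendall_tau_b(human_ratings, metric_scores)
```

In this toy example 8 of the 10 pairs are ordered the same way by both lists and 2 are not, so tau is (8 - 2) / 10 = 0.6. A value near 1 means the metric ranks outputs much as humans do, and a value near 0 means little agreement.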