Fine-grained Score Accuracy on MultiDialog

Exact Accuracy: 65.62

interpretable AI judge

[Chart: Exact Accuracy by date; y-axis ticks 62.339, 63.9795, 65.62, 67.2605; x-axis label Feb 27, 2026]
Updated 1mo ago

Evaluation Results

Method    Links
65.62    75.61    78.47