Dialogue Evaluation on Amazon Topical-Chat

Naturalness (Pearson r): 0.806

MILE-RefHumEval

Feb 10, 2026
Updated 4d ago

Evaluation Results

Method  Links
0.806  0.75  0.805  0.739  0.746  0.683  0.863  0.813  0.692  0.612
2026.02
0.549  0.565  0.594  0.605  0.627  0.631  0.531  0.551  0.575  0.588