Dialogue Generation on Anthropic-HH (test)

69.07 Average Preference Score

Cal-DPO

[Chart: scores over time; y-axis ticks 42.9972, 49.7661, 56.535, 63.3039; x-axis from Dec 19, 2024 to Feb 9, 2026]
Updated 4d ago

Evaluation Results

Date      Average Preference Score
2024.12   69.07                      73.52   64.61
2026.02   68.72                      -       -
2026.02   68.52                      -       -
2026.02   68.02                      -       -
2024.12   67.86                      72.3    63.39
2026.02   67.05                      -       -
2024.12   64.34                      69.32   59.35
2024.12   63.34                      68.77   57.91
2024.12   61.62                      65.52   57.71
2024.12   60.31                      65.34   55.28
2024.12   59.16                      63.19   55.12
2026.02   58.99                      -       -
2024.12   56.3                       60.21   52.38
2026.02   55.3                       -       -
2026.02   49.48                      -       -
2026.02   44                         -       -