
Role-playing Evaluation on RoleChat (val)

Overall MSE: 0.21

RoleJudge

Apr 15, 2026
Updated 1mo ago

Evaluation Results

Method    Links
2026.04
0.21    0.81    0.62
0.68    0.68    0.59
1.42    0.51    0.46
2026.04
1.86    0.44    0.38
2.94    0.35    0.26