
Reference-free Conversation Evaluation on CRSArena-Eval RD

Pearson Correlation (r): 0.712

FACE

[Chart: Pearson Correlation (r) over time; y-axis ticks 0.07656, 0.24153, 0.4065, 0.57147; x-axis tick May 30, 2025]
Updated 11d ago

Evaluation Results

Date      Pearson (r)   (unlabeled)
2025.05   0.712         0.668
2025.05   0.65          0.635
2025.05   0.617         0.555
2025.05   0.605         0.574
2025.05   0.577         0.577
2025.05   0.57          0.453
2025.05   0.564         0.522
2025.05   0.551         0.53
2025.05   0.549         0.55
2025.05   0.522         0.482
2025.05   0.498         0.472
2025.05   0.49          0.471
2025.05   0.49          0.444
2025.05   0.488         0.482
2025.05   0.484         0.534
2025.05   0.482         0.392
2025.05   0.481         0.457
2025.05   0.468         0.462
2025.05   0.464         0.455
2025.05   0.453         0.446
2025.05   0.447         0.43
2025.05   0.443         0.437
2025.05   0.433         0.422
2025.05   0.432         0.422
2025.05   0.425         0.4
2025.05   0.405         0.363
2025.05   0.395         0.387
2025.05   0.36          0.341
2025.05   0.351         0.364
2025.05   0.342         0.329
2025.05   0.339         0.423
2025.05   0.332         0.325
2025.05   0.311         0.288
2025.05   0.302         0.289
2025.05   0.279         0.29
2025.05   0.248         0.26
2025.05   0.246         0.225
2025.05   0.235         0.255
2025.05   0.217         0.203
2025.05   0.188         0.174
2025.05   0.182         0.242
2025.05   0.175         0.177
2025.05   0.174         0.174
2025.05   0.101         0.101
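
The headline metric on this page is the Pearson correlation (r). As a rough illustration of how such a number is typically obtained, the sketch below computes r between an evaluator's scores and human ratings for the same conversations. The function name `pearson_r` and the sample values are hypothetical, chosen only for illustration; they are not taken from CRSArena-Eval or from any method listed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up example: automatic scores vs. human ratings for five conversations.
metric_scores = [0.2, 0.5, 0.4, 0.9, 0.7]
human_ratings = [1, 3, 2, 5, 4]
print(round(pearson_r(metric_scores, human_ratings), 3))
```

A value near 1 means the evaluator's scores rise and fall linearly with the human ratings; a value near 0 means little linear relationship.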