Correlation Metrics on MT-Bench (LLM-as-a-Judge)

0.672 Pearson's r

TRACT

[Chart: Pearson's r by date; Mar 6, 2025]
Updated 4d ago

Evaluation Results

Date     Pearson's r  (unlabeled)
2025.03  0.672        0.639
2025.03  0.618        0.614
2025.03  0.555        0.529
2025.03  0.547        0.583
2025.03  0.541        0.556
2025.03  0.519        0.483
2025.03  0.517        0.503
2025.03  0.511        0.506
2025.03  0.483        0.469
2025.03  0.48         0.482
2025.03  0.466        0.494
2025.03  0.435        0.426
2025.03  0.432        0.421
2025.03  0.399        0.418
2025.03  0.309        0.216
2025.03  0.279        0.268
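
For context on the headline metric, Pearson's r measures how linearly one set of scores tracks another. The sketch below is a minimal illustration, assuming the leaderboard compares an evaluator's scores against reference scores for the same responses; the function name `pearson_r` and the sample scores are hypothetical and are not taken from this page or from any listed method.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator terms of r).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores, made up for illustration only.
judge_scores = [7, 5, 9, 3, 6]
reference_scores = [8, 4, 9, 2, 7]
print(round(pearson_r(judge_scores, reference_scores), 3))
```

A value near 1 means the two sets of scores rise and fall together; a value near 0 means little linear relationship.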