Pairwise Comparison on LLMEval

Agreement: 0.5098

GPT-4

[Chart: Agreement by date; data point at Nov 30, 2023]

Evaluation Results

Date      Agreement   (second metric, unlabeled)
2023.11   0.5098      0.8471
2023.11   0.5072      0.8595
2023.11   0.4804      0.7902
2023.11   0.4477      0.7582
2023.11   0.4281      0.6961
2023.11   0.4007      0.6458
2023.11   0.4         0.685
2023.11   0.2856      0.517
2023.11   0.2758      0.5556
2023.11   0.2353      0.4327