
Faithfulness evaluation on TellMeWhy

0.368 AUC π-Soft-NS

Grad-ELLM

[Chart: AUC π-Soft-NS over time. Y-axis ticks: 0.15376, 0.20938, 0.265, 0.32062. X-axis: Jan 6, 2026.]
Updated 2d ago

Evaluation Results

Date       AUC π-Soft-NS   Metric 2
2026.01    0.368           0.531
2026.01    0.368           0.531
2026.01    0.335           0.564
2026.01    0.335           0.564
2026.01    0.313           1.034
2026.01    0.309           0.796
2026.01    0.308           1.322
2026.01    0.308           0.462
2026.01    0.308           0.46
2026.01    0.308           0.462
2026.01    0.308           0.46
2026.01    0.307           0.467
2026.01    0.307           0.467
2026.01    0.304           0.465
2026.01    0.304           0.465
2026.01    0.299           0.745
2026.01    0.299           0.758
2026.01    0.286           1.356
2026.01    0.283           0.796
2026.01    0.257           0.386
2026.01    0.257           0.386
2026.01    0.247           0.426
2026.01    0.247           0.426
2026.01    0.225           0.684
2026.01    0.19            0.389
2026.01    0.19            0.389
2026.01    0.162           0.398