
Faithfulness Evaluation on SST2

0.563 AUC π-Soft (NS)

Grad-ELLM

[Chart: AUC π-Soft (NS) over time; y-axis ticks 0.23644, 0.32122, 0.406, 0.49078; x-axis tick Jan 6, 2026]
Updated 2d ago

Evaluation Results

Date      AUC π-Soft (NS)   Second metric (unlabeled)
2026.01   0.563             0.304
2026.01   0.549             0.318
2026.01   0.526             0.276
2026.01   0.503             0.253
2026.01   0.493             0.255
2026.01   0.493             0.258
2026.01   0.493             0.258
2026.01   0.413             0.347
2026.01   0.413             0.347
2026.01   0.396             0.464
2026.01   0.396             0.464
2026.01   0.361             0.165
2026.01   0.357             0.366
2026.01   0.357             0.366
2026.01   0.337             0.56
2026.01   0.337             0.56
2026.01   0.325             0.136
2026.01   0.304             0.2
2026.01   0.304             0.2
2026.01   0.3               0.278
2026.01   0.3               0.278
2026.01   0.297             0.277
2026.01   0.297             0.275
2026.01   0.297             0.277
2026.01   0.297             0.275
2026.01   0.249             0.099
2026.01   0.249             0.099