Sycophancy Evaluation on AITA

0.54 Sycophancy Score (S) PD-L

Mistral

[Chart: Sycophancy Score (S), Apr 2, 2026]
Updated 13d ago

Evaluation Results

Method  Links

2026.04  0.54    0.676   0.696   0.436   0.522   0.571   0.265   0.265   0.508   0.508   0.637   0.637   0.356   0.356   0.518   0.518   0.563   0.563
2026.04  0.399   0.637   0.527   0.351   0.425   0.279   0.338   0.338   0.584   0.584   0.532   0.532   0.197   0.197   0.309   0.309   0.199   0.199
2026.04  0.362   0.419   0.545   0.201   0.335   0.264   0.169   0.169   0.365   0.365   0.492   0.492   0.188   0.188   0.279   0.279   0.375   0.375
2026.04  0.164   0.208   0.217   0.116   0.217   0.1     0.135   0.135   0.237   0.237   0.505   0.505   0.079   0.079   0.129   0.129   0.003   0.003
2026.04  0.146   0.212   0.129   0.096   0.176   0.099   0.152   0.152   0.234   0.234   0.336   0.336   0.038   0.038   0.012   0.012   0.071   0.071
2026.04  0.138   0.199   0.163   0.114   0.13    0.093   0.105   0.105   0.206   0.206   0.399   0.399   0.022   0.022   0.039   0.039   -0.015  -0.015