Sycophancy Detection on Sycophancy benchmark (full evaluation set)

0.732 AUROC

Hypocrisy Gap

[Chart: AUROC (y-axis) by date (x-axis); Jan 14, 2026]
Updated 1mo ago

Evaluation Results

Date      AUROC   Method   Links
2026.01   0.732
2026.01   0.731
2026.01   0.588
2026.01   0.587
2026.01   0.549
2026.01   0.549
2026.01   0.5
2026.01   0.499
2026.01   0.453
2026.01   0.452
2026.01   0.424
2026.01   0.421