
Sycophancy

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Hypocritical Sycophancy Detection | Sycophancy dataset knows-truth | AUROC 74 | 9
Sycophancy Mitigation | Sycophancy | Sycophancy 36.2 | 8
Deception Evaluation | Sycophancy | CoT Plan Accuracy 89.77 | 6
Concept Recovery | SYCOPHANCY | Mean MCC 0.9827 | 6
Concept Identifiability | Sycophancy | MCC 1 | 6