
Sycophancy Evaluation on Offline Evaluation Set

Sycophancy Prevalence Score: 4

gpt-5-thinking

[Chart: Sycophancy Prevalence Score; date shown: Dec 19, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.12    4
2025.12    5.2
2025.12    14.5