Sycophancy Evaluation on Early A/B tests Online prevalence

Prevalence Change (Free Users): -0.69

gpt-5-main

[Chart: y-axis ticks -0.7245, -0.70725, -0.69, -0.67275; x-axis Dec 19, 2025]
Updated 1mo ago

Evaluation Results

Method  Links
2025.12
-0.69  -0.75