Sycophancy Evaluation on PHIL
[Chart: Sycophancy Preference by date. Highlighted point: Supervised Pinpoint Tuning, 99.34, Jan 26, 2026.]
Updated 4d ago
Evaluation Results
Method                        Details                      Date      Sycophancy Preference
----------------------------  ---------------------------  --------  ---------------------
Supervised Pinpoint Tuning    Model=Gemma-2-9B             2026.01   99.34
Synthetic Data Intervention   Model=Gemma-2-9B             2026.01   98.73
Untrained Gemma-2-9B          Model=Gemma-2-9B             2026.01   98.71
Supervised Pinpoint Tuning    Base Model=Gemma-2-2B        2026.01   90.41
Untrained Gemma-2-2B          Base Model=Gemma-2-2B        2026.01   90.35
Synthetic Data Intervention   Base Model=Gemma-2-2B        2026.01   79.65
Ours Resid                    Model=Gemma-2-9B, Prob...    2026.01   69.56
Ours SAE                      Model=Gemma-2-9B, Prob...    2026.01   60.81
Ours Resid                    Base Model=Gemma-2-2B,...    2026.01   53.98
Ours SAE                      Base Model=Gemma-2-2B,...    2026.01   50.15