Sycophancy Evaluation on Open-Ended Sycophancy
[Chart: Syc Score and Non-Syc Score by date. Highlighted entry: Synthetic Data Intervention, Syc Score 48.15, Jan 26, 2026.]
Evaluation Results
Method                       Configuration              Date     Syc Score  Non-Syc Score
Synthetic Data Intervention  Base Model=Gemma-2-2B      2026.01  48.15      50
Ours SAE                     Model=Gemma-2-9B, Prob...  2026.01  44.44      69.23
Ours Resid                   Base Model=Gemma-2-2B,...  2026.01  44.44      53.85
Synthetic Data Intervention  Model=Gemma-2-9B           2026.01  40.74      46.15
Untrained Gemma-2-2B         Base Model=Gemma-2-2B      2026.01  37.04      69.23
Supervised Pinpoint Tuning   Base Model=Gemma-2-2B      2026.01  37.04      69.23
Ours SAE                     Base Model=Gemma-2-2B,...  2026.01  37.04      61.54
Untrained Gemma-2-9B         Model=Gemma-2-9B           2026.01  33.33      69.23
Supervised Pinpoint Tuning   Model=Gemma-2-9B           2026.01  33.33      69.23
Ours Resid                   Model=Gemma-2-9B, Prob...  2026.01  29.63      69.23