Sycophancy detection on SYCON-Bench raw n=100
[Chart: Sycophancy Rate and P_self (Self-Estimated Probability) by model. Highlighted entry: Gemini 3 Flash, Sycophancy Rate 68, Apr 8, 2026.]
Updated 7d ago
Evaluation Results
| Method | Attributes | Date | Sycophancy Rate | P_self (Self-Estimated Probability) |
|---|---|---|---|---|
| Gemini 3 Flash | Family=Gemini, Organiz... | 2026.04 | 68 | 0.97 |
| Llama 4 Maverick† | Family=Llama, Organiza... | 2026.04 | 58 | 0.76 |
| GPT-5.2 | Family=GPT, Organizati... | 2026.04 | 53 | 0.72 |
| Gemini 3 Pro | Family=Gemini, Organiz... | 2026.04 | 53 | 0.99 |
| GPT-4.1 | Family=GPT, Organizati... | 2026.04 | 48 | 0.83 |
| DeepSeek V3 | Organization=DeepSeek | 2026.04 | 47 | 0.74 |
| GPT-4o | Family=GPT, Organizati... | 2026.04 | 34 | 0.65 |
| Llama 4 Scout | Family=Llama, Organiza... | 2026.04 | 16 | 0.90 |
| Sonnet 4.5 | Family=Claude, Organiz... | 2026.04 | 8 | 0.88 |
| Opus 4.6 | Family=Claude, Organiz... | 2026.04 | 2 | 0.97 |