Sycophancy Evaluation on Sycophancy Evaluation
[Chart: BRR over time. Highlighted point: Baseline, BRR 13.3, May 20, 2026. Selectable metrics: BRR, Sycophancy Rate, Non-Sycophancy Rate, Invariance Rate.]
Updated 12d ago
Evaluation Results
| Method | Model | Date | BRR | Sycophancy Rate | Non-Sycophancy Rate | Invariance Rate |
|---|---|---|---|---|---|---|
| Baseline | Nemotron-30B | 2026.05 | 13.3 | 18.8 | 81.2 | 78.4 |
| Baseline | Qwen3-8B | 2026.05 | 10.3 | 16.5 | 83.5 | 84.3 |
| SFT | Qwen3-8B | 2026.05 | 6.6 | 12.9 | 87.1 | 84.6 |
| SFT | Nemotron-30B | 2026.05 | 6.6 | 12 | 88 | 86 |
| Baseline | gpt-oss-20b | 2026.05 | 5.6 | 10.9 | 89.1 | 86.2 |
| OPCT | Nemotron-30B | 2026.05 | 3.5 | 8.9 | 91.1 | 86.7 |
| SFT | gpt-oss-20b | 2026.05 | 3.3 | 8.7 | 91.3 | 87.6 |
| OPCT | Qwen3-8B | 2026.05 | 2.5 | 8.6 | 91.4 | 89.5 |
| OPCT | gpt-oss-20b | 2026.05 | 1.2 | 6.7 | 93.3 | 90.3 |