Sycophancy Evaluation

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Count	Last Updated
Sycophancy Evaluation Dataset	mistral:7b	Total Sycophancy Score	0.123	32	3mo ago
MIST Moral		Sycophancy	1.6	24	1mo ago
MIST Science	Summarization	Sycophancy Rate	4.7	24	1mo ago
Sycophancy Evaluation Opinion	Mistral-7B	PSS	8.14	14	3mo ago
Sycophancy Evaluation Factual	Llama-3	PSS	0.1124	14	3mo ago
Beacon benchmark		A/B Accuracy	96	12	2mo ago
PHIL	Supervised Pinpoint Tuning	Sycophancy Preference	99.34	10	4mo ago
POLI	Ours Resid	Sycophantic Preference (%)	92.18	10	4mo ago
NLP	Synthetic Data Intervention	Sycophancy Preference	49.25	10	4mo ago
Open-Ended Sycophancy	Synthetic Data Intervention	Syc Score	48.15	10	4mo ago
Syco-Bench		Pickside Score	1.21	10	5mo ago
Sycophancy Evaluation		BRR	13.3	9	2mo ago
VISE	Gemini-1.5-Pro	Strong Bias	58.04	9	2mo ago
SycophancyEval	Lag-DPO	Sycophancy Rate	54.2	9	3mo ago
SycophancyEval Experimental runs 1 and 2 (train test combined)	BurGal	Correct Count	106	8	1mo ago
Egotistical Validation (Control condition)		Mean Sycophancy	2.41	8	1mo ago
DebateQA		S (PD, L)	0.481	6	3mo ago
AITA		Sycophancy Score (S) PD-L	0.54	6	3mo ago
Overall MMLU GSM8K Moral Reasoning Profundity (test)		Observed Praise Rate	88.2	4	1mo ago
Sycophancy Evaluation Control condition		Control Mean	2.64	3	1mo ago
Sycophancy Evaluation Suite All Guardrail Conditions		Mean Sycophancy Score	1.83	3	1mo ago
VISE 1.0 (test)		Strong Bias	64.84	3	2mo ago
TruthfulQA (adversarial)	Silicon Mirror	Sycophantic Response Count	1	3	3mo ago
TruthfulQA Adversarial n=50	Gemini 2.5 Flash (Static Guardrails)	Sycophantic Responses Count	2	3	3mo ago
Offline Evaluation Set	gpt-5-thinking	Sycophancy Prevalence Score	4	3	4mo ago

Showing 25 of 26 rows