GPT-4o

Benchmarks

Task Name	Dataset Name	SOTA Result
Jailbreak Attack	GPT-4o API	ASR 84	16
Targeted Adversarial Attack	GPT-4o	ASR 860	12
Open-ended image description	GPT-4o assisted evaluation	Accuracy 8.76	8
Detection of paraphrased text	GPT-4o-mini Paraphrased	ROC AUC (FPR=1%) 0.4231	8
Jailbreak Attack	GPT-4o (test)	ASR 95	6
Denial-of-Service Attack	GPT-4o-mini 2024-07-18 (test)	Response Length 16,384	6
Jailbreak	GPT-4o 29 May 2025 release	ASR 98.46	5
Policy Corruption Evaluation	GPT-4o mini	Compliance Score 3.53	5
Rule Extraction	GPT-4o non-judge output abstractions	Number of Proxy Rules 2,140	4
Rule Extraction	GPT-4o mini non-judge output abstractions	Number of Proxy Rules 1,158	4
Keyword Matching Attack	GPT-4o	KMR (alpha) 73	4
Sycophancy-Induced Spiral Dynamics Intervention	GPT-4o high-sycophancy deployment (n = 200, T = 30)	Spiral Rate 16.5	3
Jailbreaking	GPT-4o efficiency analysis	Attack Success Rate (ASR) 65.7	3
