Agent Safety Evaluation on Agent-SafetyBench

Top Agent-SafetyBench Score: 72.3 (gpt-4o + GBT-SE)

Updated 2mo ago

Evaluation Results

Method	Agent-SafetyBench Score	(unlabeled)	(unlabeled)
gpt-4o + GBT-SE (2026.01)	72.3	-	-
llama-3-8b + GBT-SE (2026.01)	70.8	-	-
gpt-4o + GBT-Basic (2026.01)	60.2	-	-
gpt-4o + Global guardrail only (2026.01)	56.8	-	-
llama-3-8b + GBT-Basic (2026.01)	55.4	-	-
llama-3-8b + Global guardrail only (2026.01)	50.4	-	-
gpt-4o (native) (2026.01)	44.2	-	-
llama-3-8b (native) (2026.01)	20.1	-	-
NoDefense (2026.05)	-	0	100
FIDES (2026.05)	-	72	42.1
CaMeL (2026.05)	-	78.3	28.5
PACT (2026.05)	-	74.7	39.3