Honesty Evaluation on GPT-4o-mini responses

Win Rate (GaaA): 68.79

GuardAdvisor

[Chart omitted; date label: Apr 8, 2026]
Updated 9d ago

Evaluation Results

Method    Links
2026.04    68.79    28.03    3.18
2026.04    64.02    33.6     2.39
2026.04    54.47    44.14    1.39