Honesty Evaluation on GPT-4o-mini responses
[Chart: Win Rate (GaaA), Win Rate (Original), and Tie Rate by method. Highlighted point: GuardAdvisor, Win Rate (GaaA) 68.79, Apr 8, 2026.]
Evaluation Results
| Method | Guardian | Date | Win Rate (GaaA) (%) | Win Rate (Original) (%) | Tie Rate (%) |
|---|---|---|---|---|---|
| GuardAdvisor | GuardAdvisor | 2026.04 | 68.79 | 28.03 | 3.18 |
| GPT-4o-mini | GPT-4o-mini | 2026.04 | 64.02 | 33.6 | 2.39 |
| GPT-4o | GPT-4o | 2026.04 | 54.47 | 44.14 | 1.39 |
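The three rate columns can be sanity-checked with a little arithmetic. The sketch below assumes they are percentages over the same set of pairwise comparisons, so each row should sum to roughly 100. The page does not state this, and the `summarize` helper and the "margin" figure (Win Rate (GaaA) minus Win Rate (Original)) are illustrative names, not metrics the benchmark reports.

```python
# Rows copied from the results table: (method, win_gaaa, win_original, tie).
ROWS = [
    ("GuardAdvisor", 68.79, 28.03, 3.18),
    ("GPT-4o-mini", 64.02, 33.60, 2.39),
    ("GPT-4o", 54.47, 44.14, 1.39),
]


def summarize(rows):
    """Return {method: (total, margin)}, each rounded to 2 decimals.

    total  = sum of the three rates (expected to be close to 100 if they
             are percentages of the same comparisons; an assumption).
    margin = Win Rate (GaaA) minus Win Rate (Original); a hypothetical
             derived figure, not a column from the page.
    """
    out = {}
    for method, win_gaaa, win_original, tie in rows:
        total = round(win_gaaa + win_original + tie, 2)
        margin = round(win_gaaa - win_original, 2)
        out[method] = (total, margin)
    return out


if __name__ == "__main__":
    for method, (total, margin) in summarize(ROWS).items():
        print(f"{method}: total={total}, margin={margin}")
```

Under that assumption, small deviations from 100 (such as in the GPT-4o-mini row) would be explained by rounding in the published figures.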