Robustness Evaluation on GPT-4o-mini responses
Top GaaA Win Rate: 63.11 (GuardAdvisor)
[Figure: chart of GaaA Win Rate, Original Win Rate, and Tie Rate; data point dated Apr 8, 2026]
Evaluation Results
Method        Guardian      Date     GaaA Win Rate  Original Win Rate  Tie Rate
GuardAdvisor  GuardAdvisor  2026.04  63.11          34.29              2.59
GPT-4o-mini   GPT-4o-mini   2026.04  46.11          52.16              1.73
GPT-4o        GPT-4o        2026.04  39.48          59.08              1.44