Safety Alignment on WildGuardMix
[Chart: Win Rate, Tie Rate, and Loss Rate by method. Highlighted point: Chain-of-Thought, Win Rate 55, Jun 26, 2025.]
Updated 4d ago
Evaluation Results
Method                Configuration              Date     Win Rate  Tie Rate  Loss Rate
Chain-of-Thought      Main Model=GPT-4o-mini...  2025.06  55        4         41
Best-of-N             Main Model=GPT-4o-mini...  2025.06  52        9         39
Multi-Agent Debate    Main Model=GPT-4o-mini...  2025.06  50        11        39
RLAIF                 training_context=after...  2025.06  42        41        17
Self-Refine (Debate)  training_context=after...  2025.06  36        43        21
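
The results listed above can be sanity-checked with a short script. The sketch below is illustrative only: the values are transcribed from this page, and the helper names `check_totals` and `net_margin` are hypothetical. It assumes the three rates for each method are meant to sum to 100, which holds for every row here. The "net margin" (win rate minus loss rate) is an added convenience, not a metric reported by the leaderboard.

```python
# Rows transcribed from the results on this page: (method, win, tie, loss).
RESULTS = [
    ("Chain-of-Thought", 55, 4, 41),
    ("Best-of-N", 52, 9, 39),
    ("Multi-Agent Debate", 50, 11, 39),
    ("RLAIF", 42, 41, 17),
    ("Self-Refine (Debate)", 36, 43, 21),
]


def check_totals(rows):
    """Return the methods whose win + tie + loss does not equal 100."""
    return [name for name, win, tie, loss in rows if win + tie + loss != 100]


def net_margin(rows):
    """Win rate minus loss rate per method (illustrative, not a page metric)."""
    return {name: win - loss for name, win, _tie, loss in rows}


if __name__ == "__main__":
    # Every row should account for 100 of the comparisons.
    assert check_totals(RESULTS) == []
    # Print methods ordered by net margin, largest first.
    for name, margin in sorted(net_margin(RESULTS).items(), key=lambda kv: -kv[1]):
        print(f"{name:22s} {margin:+d}")
```

Ordering by net margin differs from the page's ordering by win rate: methods with a high tie rate, such as RLAIF, lose less often even though they win less often.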