Response Generation on WildGuardMix
[Chart: Win Count over time. Top result: Chain-of-Thought, Win Count 61 (Jun 26, 2025). Metrics shown: Win Count, Tie Count, Lose Count.]
Updated 4d ago
Evaluation Results
Method              Settings                   Links (date)  Win Count  Tie Count  Lose Count
Chain-of-Thought    Main Model=Vicuna, Inf...  2025.06       61         11         28
Multi-Agent Debate  Main Model=Vicuna, Inf...  2025.06       54         21         25
Best-of-N           Main Model=Vicuna, Inf...  2025.06       52         17         31
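
The three counts for each method sum to 100, so they can be compared on a common scale. The sketch below is one possible way to summarize them; the counts are copied from the results listed here, while the `summarize` helper and the "net wins" figure (wins minus losses) are illustrative assumptions, not metrics defined by the benchmark.

```python
# Counts copied from the Evaluation Results listing: (win, tie, lose).
results = {
    "Chain-of-Thought": (61, 11, 28),
    "Multi-Agent Debate": (54, 21, 25),
    "Best-of-N": (52, 17, 31),
}


def summarize(win, tie, lose):
    """Hypothetical helper: win rate and net wins for one method.

    This is an illustrative summary, not the benchmark's official metric.
    """
    total = win + tie + lose
    return {"total": total, "win_rate": win / total, "net_wins": win - lose}


summary = {name: summarize(*counts) for name, counts in results.items()}

for name, s in summary.items():
    print(f"{name}: win rate {s['win_rate']:.0%} of {s['total']}, "
          f"net wins {s['net_wins']:+d}")
```

Under this reading, the ordering by Win Count and by net wins is the same: Chain-of-Thought, then Multi-Agent Debate, then Best-of-N.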