Safety Evaluation on Crescendo
[Chart: ASR by date. Highlighted entry: Claude 3.5 Sonnet, ASR 12, Feb 28, 2025.]
Evaluation Results
Model              Method              Date     ASR
Claude 3.5 Sonnet  NBF steering=true   2025.02  12
Claude 3.5 Sonnet  NBF steering=false  2025.02  21.5
GPT-3.5-turbo      NBF steering=true   2025.02  23.5
GPT-4o             NBF steering=true   2025.02  26
o1                 NBF steering=true   2025.02  28
o1                 NBF steering=false  2025.02  44.5
GPT-3.5-turbo      NBF steering=false  2025.02  56
GPT-4o             NBF steering=false  2025.02  56.5
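The results list each model twice, once with NBF steering=true and once with NBF steering=false, so the two settings can be compared directly. The following is a minimal sketch of that arithmetic, using only the ASR values reported above. The dictionary layout and the `steering_deltas` helper are illustrative assumptions, not part of the benchmark's tooling.

```python
# ASR values copied from the Evaluation Results entries (all dated 2025.02).
# Keys are (model, nbf_steering); this layout is an illustrative choice.
asr = {
    ("Claude 3.5 Sonnet", True): 12.0,
    ("Claude 3.5 Sonnet", False): 21.5,
    ("GPT-3.5-turbo", True): 23.5,
    ("GPT-3.5-turbo", False): 56.0,
    ("GPT-4o", True): 26.0,
    ("GPT-4o", False): 56.5,
    ("o1", True): 28.0,
    ("o1", False): 44.5,
}


def steering_deltas(table):
    """Return {model: ASR with steering=false minus ASR with steering=true}."""
    models = sorted({model for model, _ in table})
    return {m: table[(m, False)] - table[(m, True)] for m in models}


deltas = steering_deltas(asr)
for model, delta in deltas.items():
    # Positive values mean the steering=true entry reports the lower ASR.
    print(f"{model}: {delta:.1f}")
```

On these figures, every model reports a lower ASR with steering=true than with steering=false; the gap is largest for GPT-3.5-turbo and smallest for Claude 3.5 Sonnet.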