
HB

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Jailbreak Defense Evaluation | HB | Strong-Reject Score (SR): 3.065 | 21 |
| Jailbreak Detection | HB | Correctness Rate (COR): 100 | 13 |
| Safety Evaluation | HB Text | RSR: 99.33 | 12 |
| Time Series Classification | HB | Accuracy: 78.4 | 10 |