
Specialized category

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Toxicity mitigation evaluation | Specialized category | RTR 59.6 | 8 |
| Toxicity Mitigation | Specialized category Manually-designed jailbreak attacks | RTR 4.5 | 3 |
| Toxicity Mitigation | Specialized category Optimization-based jailbreak attacks | - | 0 |