
HRL/LRL Safety Prompts

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Attack Success Rate Evaluation | HRL/LRL Safety Prompts Tamil Multi-Image v1 | ASR 0 | 6 |
| Attack Success Rate Evaluation | HRL/LRL Safety Prompts Welsh Multi-Image v1 | ASR 0 | 6 |
| Attack Success Rate Evaluation | HRL/LRL Safety Prompts English, Single Image v1 | ASR 0 | 6 |
| Attack Success Rate Evaluation | HRL LRL Safety Prompts Tamil Single Image v1 | ASR 0 | 6 |
| Attack Success Rate Evaluation | HRL/LRL Safety Prompts Welsh Single Image v1 | ASR 6 | 6 |
| Attack Success Rate Evaluation | HRL LRL Safety Prompts Tamil Text v1 | Attack Success Rate 0 | 6 |
| Attack Success Rate Evaluation | HRL/LRL Safety Prompts Welsh Text v1 | Attack Success Rate 0 | 6 |