Red-teaming on Religious Discrimination principle v1 (test)
[Chart: Mean Best Category Score by date. Best result: QCI, 5.32 (Feb 12, 2026).]
Evaluation Results
Method  Setting                    Date     Mean Best Category Score
QCI     Target Model=Llama-3.1...  2026.02   5.32
CRL     Target Model=Llama-3.1...  2026.02   4.84
QCI     Target Model=Qwen3-30B...  2026.02   4.14
QCI     Target Model=GPT-4.1-Mini  2026.02   2.82
QCI     Target Model=Gemma3-12...  2026.02   2.66
CRL     Target Model=GPT-4.1-Mini  2026.02   2.57
CRL     Target Model=Gemma3-12...  2026.02   2.26
CRL     Target Model=Qwen3-30B...  2026.02   2.22
RS      Target Model=Llama-3.1...  2026.02  -0.9
RS      Target Model=Qwen3-30B...  2026.02  -1.29
RS      Target Model=Gemma3-12...  2026.02  -1.62
RS      Target Model=GPT-4.1-Mini  2026.02  -1.71