Policy Corruption Evaluation on GPT-4o mini
[Chart: Compliance Score by method; highlighted point: HPM, 3.53, Dec 20, 2025. Selectable metrics: Compliance Score, Trustfulness Score, Recklessness Score, Harm Violation Score, Value Drift Score, Self Doubt Score, Confusion Score]
Updated 4d ago
Evaluation Results
| Method | Victim Model | Date | Compliance Score | Trustfulness Score | Recklessness Score | Harm Violation Score | Value Drift Score | Self Doubt Score | Confusion Score |
| HPM | GPT-4o-mini | 2025.12 | 3.53 | 3.21 | 2.95 | 3.6 | 2.81 | 2.9 | 3.04 |
| PAP | GPT-4o-mini | 2025.12 | 2.25 | 2.01 | 1.85 | 2.22 | 1.25 | 1.55 | 1.43 |
| CoA | GPT-4o-mini | 2025.12 | 1.81 | 1.53 | 1.3 | 1.92 | 0.88 | 0.81 | 1.05 |
| PAIR | GPT-4o-mini | 2025.12 | 1.22 | 1.01 | 0.85 | 1.03 | 0.61 | 0.42 | 0.55 |
| AutoDAN | GPT-4o-mini | 2025.12 | 1.03 | 0.84 | 0.62 | 0.81 | 0.45 | 0.22 | 0.31 |