Policy Corruption Evaluation on Gemini-2-Flash
[Chart: scores by metric for each attack method. Highlighted point: HPM, Compliance, 3.65 (Dec 20, 2025). Metrics: Compliance, Trustfulness, Recklessness, Harm Principle Violation, Value System Drift, Self Doubt, Confusion.]
Updated 1mo ago
Evaluation Results
| Method | Victim Model | Links | Compliance | Trustfulness | Recklessness | Harm Principle Violation | Value System Drift | Self Doubt | Confusion |
|---|---|---|---|---|---|---|---|---|---|
| HPM | Gemini-2-... | 2025.12 | 3.65 | 3.33 | 3.01 | 3.75 | 2.92 | 3.05 | 3.11 |
| PAP | Gemini-2-... | 2025.12 | 2.31 | 2.15 | 1.9 | 2.35 | 1.33 | 1.62 | 1.55 |
| CoA | Gemini-2-... | 2025.12 | 1.92 | 1.6 | 1.42 | 2.03 | 0.95 | 0.91 | 1.12 |
| PAIR | Gemini-2-... | 2025.12 | 1.3 | 1.12 | 0.91 | 1.15 | 0.72 | 0.5 | 0.63 |
| AutoDAN | Gemini-2-... | 2025.12 | 1.11 | 0.92 | 0.73 | 0.95 | 0.51 | 0.33 | 0.4 |