Emergent Misalignment Measurement on Legal
[Chart: Misalignment and Incoherence by method. Highlighted point: Persona Vectors, Misalignment 0.58, Aug 8, 2025.]
Updated 1mo ago
Evaluation Results
Method           Model        Date     Misalignment  Incoherence
Persona Vectors  Qwen2.5-32B  2025.08  0.58          0.25
Interleaving     Qwen2.5-32B  2025.08  3.21          7.59
Interleaving++   Qwen2.5-32B  2025.08  4.01          6.64
KL               Qwen2.5-32B  2025.08  5.83          0
Interleaving+    Qwen2.5-32B  2025.08  13.77         13.03
Misaligned       Qwen2.5-32B  2025.08  40.83         9.17