Emergent Misalignment Measurement on Code
[Chart: Misalignment (alternate metric: Incoherence); labeled entry: KL, Aug 8, 2025]
Updated 1mo ago
Evaluation Results
| Method | Model | Date | Misalignment | Incoherence |
|---|---|---|---|---|
| KL | Qwen2.5-32B | 2025.08 | 0 | 0 |
| Persona Vectors | Qwen2.5-32B | 2025.08 | 0.04 | 0.12 |
| Interleaving+ | Qwen2.5-32B | 2025.08 | 0.21 | 8.27 |
| Interleaving++ | Qwen2.5-32B | 2025.08 | 0.33 | 5.25 |
| Interleaving | Qwen2.5-32B | 2025.08 | 0.83 | 6.42 |
| Misaligned | Qwen2.5-32B | 2025.08 | 4.18 | 9.21 |