Misaligned Task Learning on Security In-domain

2.1 Misalignment

Persona Vectors

Updated 4 months ago

Evaluation Results

Method	Date	Misalignment	(unlabeled)
Persona Vectors	2025.08	2.1	3.6
KL	2025.08	6.33	0.67
Interleaving	2025.08	21.27	40.2
Interleaving++	2025.08	22.03	40.07
Interleaving+	2025.08	22.2	39.7
Misaligned	2025.08	25.33	36