Latent Knowledge Elicitation on Deceptive Alignment Benchmark (DAB), 400 scenarios
Chart: Elicitation Accuracy. Top result: MechELK, 81.2 (Apr 7, 2026).
Evaluation Results
Method          Model       Date     Elicitation Accuracy
MechELK         Llama-8B    2026.04  81.2
MechELK         Mistral-7B  2026.04  79.6
RepE            Llama-8B    2026.04  70.2
SAE-Probe       Llama-8B    2026.04  69.8
RepE            Mistral-7B  2026.04  68.7
Act. Patching   Llama-8B    2026.04  68.4
CCS             Llama-8B    2026.04  67.3
SAE-Probe       Mistral-7B  2026.04  67.3
Act. Patching   Mistral-7B  2026.04  66.1
CCS             Mistral-7B  2026.04  65.9
Direct Probing  Llama-8B    2026.04  62.1
Direct Probing  Mistral-7B  2026.04  60.8