Unlearning Detection on WMDP
[Chart: Accuracy over time. Top entry: LLaMA-3.1-8B + Activation-based Detection, Accuracy 100 (Jun 16, 2025). Updated 1mo ago.]
Evaluation Results
| Method | Settings | Date | Accuracy |
|---|---|---|---|
| LLaMA-3.1-8B + Activation-based Detection | Model=LLaMA-3.1-8B, Un... | 2025.06 | 100 |
| Yi-34B + Activation-based Detection | Model=Yi-34B, Unlearni... | 2025.06 | 100 |
| Zephyr-7B | Unlearning Method=NPO,... | 2025.06 | 100 |
| LLaMA-3.1-8B | Unlearning Method=NPO,... | 2025.06 | 100 |
| LLaMA-3.1-8B | Unlearning method=NPO,... | 2025.06 | 100 |
| Qwen2.5-14B | Unlearning Method=NPO,... | 2025.06 | 99.86 |
| Yi-34B | Unlearning method=NPO,... | 2025.06 | 99.86 |
| Zephyr-7B + Activation-based Detection | Model=Zephyr-7B, Unlea... | 2025.06 | 99.72 |
| Yi-34B | Unlearning Method=NPO,... | 2025.06 | 99.72 |
| Zephyr-7B | Unlearning method=NPO,... | 2025.06 | 99.72 |
| Qwen2.5-14B | Unlearning method=NPO,... | 2025.06 | 99.72 |
| Qwen2.5-14B + Activation-based Detection | Model=Qwen2.5-14B, Unl... | 2025.06 | 98.59 |
| unlearning trace detection | Model=Qwen2.5-14B | 2025.06 | 95.07 |
| unlearning trace detection | Model=Yi-34B | 2025.06 | 94.37 |
| unlearning trace detection | Model=LLaMA-3.1-8B | 2025.06 | 93.24 |
| unlearning trace detection | Model=Zephyr-7B | 2025.06 | 90.56 |