Utility Preservation on MMLU 5-shot (test)
Top Utility Score: 67.2, Benign (No Attack)
[Chart: Utility Score, May 10, 2026]
Updated 22d ago
Evaluation Results
| Method | Configuration (truncated in source) | Date | Utility Score |
|---|---|---|---|
| Benign (No Attack) | Method=Benign (No Atta... | 2026.05 | 67.2 |
| SFT-based | Method=SFT-based, Atta... | 2026.05 | 67.1 |
| SFT-based | Method=SFT-based, Atta... | 2026.05 | 67.1 |
| SFT-based | Method=SFT-based, Atta... | 2026.05 | 67.1 |
| VPI | Method=VPI, Attack Tar... | 2026.05 | 67.1 |
| VPI | Method=VPI, Attack Tar... | 2026.05 | 67.1 |
| BadDLM | Method=BadDLM (Ours),... | 2026.05 | 67.1 |
| BadDLM | Method=BadDLM (Ours),... | 2026.05 | 67.1 |
| BadDLM | Method=BadDLM (Ours),... | 2026.05 | 67.1 |
| SFT-based | Method=SFT-based, Atta... | 2026.05 | 67 |
| VPI | Method=VPI, Attack Tar... | 2026.05 | 67 |
| BadDLM | Method=BadDLM (Ours),... | 2026.05 | 67 |
| RL-based | Method=RL-based, Attac... | 2026.05 | 66.3 |
| RL-based | Method=RL-based, Attac... | 2026.05 | 66.3 |
| RL-based | Method=RL-based, Attac... | 2026.05 | 65.9 |
| RL-based | Method=RL-based, Attac... | 2026.05 | 65.9 |