Code Backdoor Evaluation
[Chart: Insecurity Impact (%) by date. Highlighted point: AdL*, -73, Apr 12, 2026. Available metrics: Insecurity Impact (%), Code CE Change, MMLU Delta Accuracy (%).]
Updated 5d ago
Evaluation Results
Method                   Date      Insecurity Impact (%)   Code CE Change   MMLU Delta Accuracy (%)
AdL* (Backbone=Gemma)    2026.04   -73                     0                -
AdL* (Backbone=LLaMA)    2026.04   -63.2                   -0.2             -
AdLIRA*                  2026.04   -45.1                   0                -5.3
CB (Backbone=Gemma)      2026.04   -11.6                   0                -
CB (Backbone=LLaMA)      2026.04   -8.8                    0.1              -
CB                       2026.04   -5.3                    0                -17.1
LIRA* (Backbone=Gemma)   2026.04   -4.9                    0                -
GD (Backbone=LLaMA)      2026.04   -3.8                    0.2              -
GD (Backbone=Gemma)      2026.04   0.9                     22.6             -
GD                       2026.04   2                       17.8             -2.1