Code Generation on HumanEval (Attack/Defense Accuracy)
[Chart: Accuracy (Attack) by date. Plotted entry: Reporting-and-penalty mechanism, Apr 26, 2026. Available metrics: Accuracy (Attack), Accuracy (Defense Recovery).]
Updated 1mo ago
Evaluation Results
| Method | Method (setting) | Links | Accuracy (Attack) | Accuracy (Defense Recovery) |
| --- | --- | --- | --- | --- |
| Reporting-and-penalty mechanism | Attack Type=All Attack... | 2026.04 | 100 | 100 |
| GroupGuard | Attack Type=False Cons... | 2026.04 | 67 | 78 |
| GroupGuard | Attack Type=False Cons... | 2026.04 | 59 | 100 |
| GroupGuard | Attack Type=False Cons... | 2026.04 | 47 | 81 |