Code Summarization on PyT
[Chart: Recall and F1 Score by date (Feb 11, 2026)]
Updated 4d ago
Evaluation Results
Method     Method (details)           Links    Recall  F1 Score
Fixed      Victim Model=CodeT5, D...  2026.02  100     37.84
Grammar    Victim Model=CodeT5, D...  2026.02  100     36.92
AFRAIDOOR  Victim Model=CodeT5, D...  2026.02  26.15   12.85
Fixed      Victim Model=CodeT5, D...  2026.02  25.34   15.23
Fixed      Victim Model=CodeT5, D...  2026.02  21.67   12.89
Grammar    Victim Model=CodeT5, D...  2026.02  21.45   12.78
STAB       Victim Model=CodeT5, D...  2026.02  20.73   9.45
Grammar    Victim Model=CodeT5, D...  2026.02  17.89   10.67
AFRAIDOOR  Victim Model=CodeT5, D...  2026.02  5.78    3.45
AFRAIDOOR  Victim Model=CodeT5, D...  2026.02  4.23    2.78
STAB       Victim Model=CodeT5, D...  2026.02  2.89    1.87
STAB       Victim Model=CodeT5, D...  2026.02  2.15    1.45