Code Generation on MBPP (pass@1, Precision(h), Recall(h), F1(h))
[Chart: Pass@1 on MBPP. Best result: SemStamp, 34.2 (Jan 14, 2026).]
Evaluation Results
| Method | Date | Pass@1 | Precision (h) | Recall (h) | F1 (h) | AUROC (h) | TP@1 (h) | TP@5 (h) |
|---|---|---|---|---|---|---|---|---|
| SemStamp | 2026.01 | 34.2 | 66.9 | 71.8 | 69.2 | 61.7 | 0.6 | 0.8 |
| No Watermark | 2026.01 | 33.8 | - | - | - | - | - | - |
| SeqMark | 2026.01 | 33.6 | 76 | 86 | 80.7 | 79.4 | 0 | 86 |
| SWEET | 2026.01 | 33.2 | 79.5 | 62.3 | 70.2 | 74.8 | 12.4 | 24 |
| KGW | 2026.01 | 15 | 87.5 | 31 | 45.7 | 71.2 | 8 | 17 |
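The F1 (h) column can be sanity-checked against the Precision (h) and Recall (h) columns. The sketch below assumes F1 here is the usual harmonic mean of precision and recall; the page does not state how it was computed, so small differences from the reported values (for example from rounding or per-sample averaging) are to be expected. The function name `f1_score` and the `rows` dictionary are illustrative, not part of the leaderboard.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results above.
rows = {
    "SemStamp": (66.9, 71.8, 69.2),
    "SeqMark": (76.0, 86.0, 80.7),
    "SWEET": (79.5, 62.3, 70.2),
    "KGW": (87.5, 31.0, 45.7),
}

for method, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    # Assumption: a gap under half a point counts as agreement.
    print(f"{method}: computed {computed:.1f}, reported {reported}")
```

Under that assumption, the recomputed values stay within half a point of the reported ones for all four watermarked methods.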