
Target behavior attack on HumanEval

98.3 ASR

AiTM

Feb 20, 2025
Updated 1mo ago

Evaluation Results

Date      ASR
2025.02   98.3
2025.02   97.6
2025.02   97.6
2025.02   96.3
2025.02   96.2
2025.02   95.2
2025.02   94.7
2025.02   90.4
2025.02   90.4
2025.02   82.6
2025.02   76.5
2025.02   75.7
2025.02   60.1
2025.02   52.7
2025.02   0
2025.02   0