Adversarial Tutor Leakage on HumanEval (test)
[Chart: Leak Rate and Average Turns by method. Highlighted point: Manually Defined Prompts, Leak Rate 17.3, Apr 20, 2026.]
Evaluation Results
| Method | Attack Type | Date | Leak Rate | Average Turns |
|---|---|---|---|---|
| Manually Defined Prompts | Manually D... | 2026.04 | 17.3 | 6.347 |
| Multi-Agent Student | Multi-Agen... | 2026.04 | 18.9 | 10.806 |
| Base Student Adv. Agent | Base Stude... | 2026.04 | 24.4 | 12.175 |
| Student w/ Reasoning | Student w/... | 2026.04 | 24.4 | 10.6 |
| Finetuned Adv. Agent | Finetuned... | 2026.04 | 40.9 | 6.866 |
| Manually Defined Prompts | Manually D... | 2026.04 | 74.6 | 3.85 |
| Base Student Adv. Agent | Base Stude... | 2026.04 | 76.2 | 4.952 |
| Student w/ Reasoning | Student w/... | 2026.04 | 78 | 4.039 |
| Multi-Agent Student | Multi-Agen... | 2026.04 | 78 | 4.516 |
| Finetuned Adv. Agent | Finetuned... | 2026.04 | 87.8 | 3.062 |