Instruction Hierarchy Robustness on Human Red-teaming IH-Challenge
[Chart: Number of Tasks by method. Point shown: GPT-5-Mini, 271, Mar 11, 2026. Selectable metrics: Number of Tasks, Average Attempts per Task, Success Rate, Success Rate per Attempt.]
Updated 1mo ago
Evaluation Results
Method                  Configuration           Date     Number of Tasks  Average Attempts per Task  Success Rate  Success Rate per Attempt
GPT-5-Mini              system mitigation=none  2026.03  271              32.84                      36.2          1.5
GPT-5-Mini-R + Monitor  output monitor=true     2026.03  268              43.08                      7.1           0.2
GPT-5-Mini-R            system mitigation=none  2026.03  265              52.39                      11.7          0.4