Text Safety Filter Bypass on NSFW-200 Target Filter: GPT-4.1 1.0 (cross-filter transfer)
[Chart: Bypass Rate over time. Plotted point: OptJail, 97.5, May 25, 2025.]
Updated 8d ago
Evaluation Results
Method    Configuration                Date      Bypass Rate
OptJail   Optimization Filter=GP...    2025.05   97.5
OptJail   Optimization Filter=Sh...    2025.05   97
OptJail   Optimization Filter=De...    2025.05   95