
Safety Evaluation on PS-Bench

29.1 ASR (Hate Speech)

MemU (with intent-legitimation suppression)

28.46832.7343741.266
Jan 25, 2026
Updated 4d ago

Evaluation Results

Method    Links

2026.01   29.1   12.9   11     11.2   37.5   16.5   30.8   6.2   19.4
2026.01   29.7   14.4   10.8   10.5   37.8   18.6   30.2   5     19.63
2026.01   32.1   14.2   13.1   15.5   43.3   24.3   42.4   8.2   24.14
2026.01   36     16     11     6      52     22     34     8     23.13
2026.01   40.3   21.8   18.4   17.8   47.2   31     45.7   7.6   28.73
2026.01   42.2   16.6   19.6   18     51.6   29.2   43.3   7.2   28.46
2026.01   44.9   19.4   19.1   19.3   50.3   30.4   44.5   9.4   29.66
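
The page does not label the score columns. Reading each row as nine values, the last one appears to be the mean of the eight before it. That reading is an assumption drawn from the numbers themselves, not something the page states. A minimal Python sketch of the check follows; the names `rows`, `row_mean`, and `matches_reported` are illustrative only.

```python
# Each row: eight per-category scores followed by the final column.
# Treating the final column as the row mean is an assumption, tested below.
rows = [
    [29.1, 12.9, 11, 11.2, 37.5, 16.5, 30.8, 6.2, 19.4],
    [29.7, 14.4, 10.8, 10.5, 37.8, 18.6, 30.2, 5, 19.63],
    [32.1, 14.2, 13.1, 15.5, 43.3, 24.3, 42.4, 8.2, 24.14],
    [36, 16, 11, 6, 52, 22, 34, 8, 23.13],
    [40.3, 21.8, 18.4, 17.8, 47.2, 31, 45.7, 7.6, 28.73],
    [42.2, 16.6, 19.6, 18, 51.6, 29.2, 43.3, 7.2, 28.46],
    [44.9, 19.4, 19.1, 19.3, 50.3, 30.4, 44.5, 9.4, 29.66],
]


def row_mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


def matches_reported(row, tol=0.006):
    """True if the last value equals the mean of the others, up to rounding."""
    *scores, reported = row
    return abs(row_mean(scores) - reported) <= tol


for row in rows:
    # Print the reported value, the recomputed mean, and whether they agree.
    print(row[-1], round(row_mean(row[:-1]), 4), matches_reported(row))
```

The tolerance of 0.006 allows for the final column being rounded to at most two decimal places.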