Safety Control on DialoGPT large

Safety-Quality Score: 0.647

SafeCtrl-RL

[Chart: Safety-Quality Score; May 25, 2026]
Updated 8d ago

Evaluation Results

Date      Safety-Quality Score   (unlabeled)
2026.05   0.647                  0.36
2026.05   0.589                  0.302
2026.05   0.578                  0.29
2026.05   0.567                  0.28
2026.05   0.528                  0.241
2026.05   0.526                  0.239
2026.05   0.501                  0.213
2026.05   0.497                  0.21
2026.05   0.493                  0.205
2026.05   0.477                  0.19
2026.05   0.439                  0.152
2026.05   0.408                  0.121
2026.05   0.083                  -0.204
2026.05   0.082                  -0.205
2026.05   0.08                   -0.207
2026.05   0.077                  -0.21
2026.05   0.027                  -0.26