
Unsafe Prompt Detection on ToxicChat (test)

Precision: 0.815

OpenAI Moderation API

[Chart: precision over time, Feb 21, 2024 to Feb 18, 2025]
Updated 1mo ago

Evaluation Results

Method  Links

Precision  Recall  F1     (unlabeled)
0.815      0.145   0.246  -
2024.02
0.753      0.667   0.707  -
2024.02
0.744      0.396   0.517  -
2025.02
0.65       0.881   0.748  94.17
0.614      0.148   0.238  -
2024.02
0.559      0.634   0.594  -
2024.02
0.475      0.831   0.604  -
2025.02
0.44       0.832   0.576  83.49
2025.02
0.423      0.859   0.567  93.55
2025.02
0.418      0.716   0.528  43.52
2025.02
0.395      0.781   0.525  35.12
2025.02
0.381      0.751   0.506  45.42
2025.02
0.338      0.716   0.461  52.7
2025.02
0.333      0.79    0.468  47.12
2025.02
0.242      0.57    0.341  17.2
2024.02
0.241      0.822   0.373  -
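
The third value in each result row is consistent with an F1 score, i.e. the harmonic mean of the first two values read as precision and recall. That reading is an inference from the numbers, not something the page states. A minimal sketch for checking it on the first four rows (the function name `f1_score` and the `rows` list are illustrative, not part of the leaderboard):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported third value) copied from the first four rows.
# Treating the columns as precision / recall / F1 is an assumption.
rows = [
    (0.815, 0.145, 0.246),
    (0.753, 0.667, 0.707),
    (0.744, 0.396, 0.517),
    (0.650, 0.881, 0.748),
]

for p, r, reported in rows:
    computed = f1_score(p, r)
    print(f"P={p:.3f} R={r:.3f} F1={computed:.3f} (reported {reported:.3f})")
```

The top row illustrates why precision alone can mislead: a precision of 0.815 paired with a recall of 0.145 yields an F1 of only about 0.246, below rows with much lower precision but higher recall.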