Toxicity Mitigation on ATTAQ

0.122 Average Max Toxicity

M+

Nov 11, 2025
Updated 1mo ago

Evaluation Results

Method  Links
2025.11  0.122  0     10.81  23.7
2025.11  0.207  5.8   9.29   11.8
2025.11  0.331  17.5  11.26  16
2025.11  0.364  20    8.35   12.9
2025.11  0.426  35.8  8.36   12.7
2025.11  0.468  41.7  7.63   10
2025.11  0.643  80.8  7.48   15
2025.11  0.661  75    7.34   13.7
2025.11  0.682  84.2  7.21   12.3