Harmful Content Detection on PHEME Known Attacks: DeepWordBug, TFAdjusted, TREPAT (test)

85.59 Accuracy

LLM-SGA/ARHOCD

Updated 5mo ago

Evaluation Results

Method	Links	Accuracy
LLM-SGA/ARHOCD 2025.12		85.59	84.42	85.54	84.85	10.59
LLM-SGA/ARHOCD 2025.12		85.49	84.32	85.45	84.76	10.59
LLM-puri 2025.12		83.66	82.73	82.09	82.38	12.55
MRAT 2025.12		83.54	82.74	81.71	82.15	13.96
EnsSel 2025.12		81.81	81.32	79.13	79.93	15.35
AT 2025.12		81.59	80.82	79.23	79.85	15.03
Det&Res 2025.12		81.38	80.42	79.27	79.74	14.99
OutReg 2025.12		81.15	80.6	78.36	79.16	15.26
ARText 2025.12		80.95	79.62	79.82	79.71	12.79
ARDEL 2025.12		80.78	80.1	78.08	78.81	15.03