HARMFULQA

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result	
Jailbreak Attack	HarmfulQA	JADES	56	33
Harmlessness evaluation	HarmfulQA	Helpfulness Score	69.4	33
LLM Safety and Informativeness Evaluation	HarmfulQA	Safety Rate	98.1	15
Safety Evaluation	HarmfulQA v1 (test)	Compliance Rate	16.8	9
Safety Evaluation	HARMFULQA various domains	Safety Score (Chinese)	19.17	8
Refusal Evaluation	HarmfulQA	Refusal Rate	85.31	7
Red-Teaming (Attack Success Rate)	HARMFULQA	ASR	0.702	7
Jailbreak Attack Evaluation	HarmfulQA	ASR	16	6
Safety and Informativeness Evaluation	HarmfulQA Social science	Safety Score	90	4
Safety and Informativeness Evaluation	HarmfulQA Science and Technology	Safety Score	95	4
Safety and Informativeness Evaluation	HarmfulQA Philosophy and Ethics	Safety Score	80	4
Safety and Informativeness Evaluation	HarmfulQA Mathematics and Logic	Safety Score	76.7	4
Safety and Informativeness Evaluation	HarmfulQA Literature and Language	Safety Score	100	4
Safety and Informativeness Evaluation	HarmfulQA History and Culture	Safety Score	90	4
Safety and Informativeness Evaluation	HarmfulQA Health and Medicine	Safety Score	85	4
Safety and Informativeness Evaluation	HarmfulQA Geography and Environment	Safety Rate	95	4
Safety and Informativeness Evaluation	HarmfulQA Education and Pedagogy	Safety Score	100	4
Safety and Informativeness Evaluation	HarmfulQA Business and Economic	Safety Rate	91	4
Language Modeling	HarmfulQA	PPL	83.41	1
