HS

Benchmarks

Task Name	Dataset Name	SOTA Result
Narrative Understanding	HS	First-Token Accuracy: 87.8	24
Commonsense Reasoning	HS	Accuracy: 85.87	23
Human Safety / Alignment	HS	HS Average: 1.84	18
Code Generation	HS (test)	BLEU: 67.1	10
Query representation clustering	HS (test)	ARI: 0.0723	3
