Generative AI Output Safety on HarmBench and AdvBench (test)

Safe Rate: 82.88

Reliable Consensus Sampling

Updated 5mo ago

Evaluation Results

Method	Links	Safe Rate
Reliable Consensus Sampling 2025.12		82.88	0	24.26
Reliable Consensus Sampling 2025.12		81.49	0	22.21
Reliable Consensus Sampling 2025.12		77.42	0	21.63
Reliable Consensus Sampling 2025.12		67.9	0	18.42
Consensus Sampling 2025.12		20.39	71.79	22.6
Consensus Sampling 2025.12		15.83	72.52	26.73
Consensus Sampling 2025.12		12.56	75.75	23.59
Consensus Sampling 2025.12		1.45	82.95	21.43