Harmful Content Safety on HarmBench
[Chart: Evaluation Score (avg@4) over time. Labeled point: Self-ReSET, 98.4, May 9, 2026.]
Updated 22d ago
Evaluation Results
| Method | Base Model | Date | Evaluation Score (avg@4) |
| --- | --- | --- | --- |
| Self-ReSET | DeepSeek-R1... | 2026.05 | 98.4 |
| STAR-1 | Qwen3-8B | 2026.05 | 98.3 |
| Self-ReSET | Qwen3-8B | 2026.05 | 98.1 |
| RECAP | DeepSeek-R1... | 2026.05 | 97.8 |
| DAPO | DeepSeek-R1... | 2026.05 | 97.3 |
| STAR-1 | DeepSeek-R1... | 2026.05 | 95 |
| STAR-1 | DeepSeek-R1... | 2026.05 | 92.4 |
| DAPO | Qwen3-8B | 2026.05 | 89 |
| RECAP | DeepSeek-R1... | 2026.05 | 86.9 |
| DAPO | DeepSeek-R1... | 2026.05 | 86.1 |
| Self-ReSET | DeepSeek-R1... | 2026.05 | 85.1 |
| RECAP | Qwen3-8B | 2026.05 | 84.7 |
| Safechain | Qwen3-8B | 2026.05 | 66.6 |
| Base | Qwen3-8B | 2026.05 | 63.3 |
| Safechain | DeepSeek-R1... | 2026.05 | 50.1 |
| Safechain | DeepSeek-R1... | 2026.05 | 39.2 |
| Base | DeepSeek-R1... | 2026.05 | 30.1 |
| Base | DeepSeek-R1... | 2026.05 | 23.4 |