Safety Evaluation on HCoT
[Chart: OverRefusal Score by method. Highlighted entry: STAR-1, 98. Dated May 9, 2026.]
Evaluation Results
| Method | Model Backbone | Date | OverRefusal Score |
|---|---|---|---|
| STAR-1 | DeepSee... | 2026.05 | 98 |
| STAR-1 | DeepSee... | 2026.05 | 98 |
| Base | DeepSee... | 2026.05 | 96 |
| SafeChain | DeepSee... | 2026.05 | 96 |
| Base | DeepSee... | 2026.05 | 96 |
| Base | DeepSee... | 2026.05 | 94 |
| SafeChain | DeepSee... | 2026.05 | 92 |
| SafeChain | DeepSee... | 2026.05 | 88 |
| STAR-1 | DeepSee... | 2026.05 | 78 |
| SInternal | DeepSee... | 2026.05 | 64 |
| SInternal | DeepSee... | 2026.05 | 62 |
| SInternal | DeepSee... | 2026.05 | 38 |