Overrefusal Evaluation on Fortress OR
97.6 Helpfulness Score (RECAP)
[Chart: Helpfulness Score by date; x-axis: Oct 1, 2025; y-axis: Helpfulness Score]
Updated 6d ago
Evaluation Results
Method      Backbone      Date      Helpfulness Score
RECAP       DSQwen-14B    2025.10   97.6
SafeChain   DSQwen-14B    2025.10   96.4
SFT         DSQwen-14B    2025.10   96
Original    DSQwen-14B    2025.10   95
DAPO        DSQwen-14B    2025.10   95
STAR        DSQwen-14B    2025.10   93.2
RECAP       DSLlama-8B    2025.10   91.8
Original    DSLlama-8B    2025.10   90
STAR        DSLlama-8B    2025.10   86
SafeChain   DSLlama-8B    2025.10   84.5
DAPO        DSLlama-8B    2025.10   82.8
SFT         DSLlama-8B    2025.10   82.4