Safety Evaluation on JBB-Behaviors
[Chart: Safety Score on JBB-Behaviors. Top entry: gpt-oss-120b, Safety Score 99.3 (Feb 6, 2026). Metrics shown: Safety Score, Helpfulness Score, CoSA Score.]
Evaluation Results
Method        Date     Safety Score  Helpfulness Score  CoSA Score
gpt-oss-120b  2026.02  99.3          33.7               32.4
TR1S-8b/pos   2026.02  99            33.7               32.3
gpt-oss-20b   2026.02  98.7          29.8               27.2
PACT          2026.02  98.7          35.6               34
DSR1-0528-8b  2026.02  77.8          53.1               18.6
Q3-235b       2026.02  72.4          64.8               14.9
DSR1-0528     2026.02  71.5          66.6               16.1
Q3-8b         2026.02  65.4          66.4               13
TR1S-8b/adh   2026.02  63.4          67.1               9.7