Safety Evaluation on BeaverTails
Top Safety Score: 99.4 (TELLME NT-Xent)
[Chart: Safety Score over time, Feb 7, 2025 to Mar 17, 2026]
Updated 6d ago
Evaluation Results
Method                          Details                    Date     Safety Score
TELLME NT-Xent                  Base Model=Gemma-2-9B-...  2025.02  99.4
TELLME                          Base Model=Gemma-2-9B-IT   2025.02  99.1
Origin                          Base Model=Gemma-2-9B-IT   2025.02  98
SFT                             Base Model=Gemma-2-9B-...  2025.02  97.6
IQuest-Coder-V1-40B-Thinking    Parameters=40B, Type=T...  2026.03  76.7
Qwen3-Coder-480B-A35B-Instruct  Parameters=480B-A35B,...   2026.03  70.5
Qwen2.5-Coder-32B-Instruct      Parameters=32B, Type=I...  2026.03  68
IQuest-Coder-V1-40B-Instruct    Parameters=40B, Type=I...  2026.03  67.5