Safety Evaluation on Manual Evaluation Set
Best Average Safety Score: 3.83 (Self-MOA)
[Chart: Average Safety Score by date; plotted point dated Mar 7, 2026]
Evaluation Results
| Method   | Model Family | Date    | Average Safety Score |
|----------|--------------|---------|----------------------|
| Self-MOA | qwen         | 2026.03 | 3.83                 |
| PKU-RLHF | llama        | 2026.03 | 3.8                  |
| Self-MOA | llama        | 2026.03 | 3.67                 |
| PKU-RLHF | gemma-3      | 2026.03 | 3.67                 |
| PKU-RLHF | qwen         | 2026.03 | 3.37                 |
| Self-MOA | gemma-3      | 2026.03 | 3.3                  |
| Self-MOA | gemma-2      | 2026.03 | 3.2                  |
| base     | gemma-3      | 2026.03 | 2.97                 |
| base     | llama        | 2026.03 | 2.77                 |
| base     | qwen         | 2026.03 | 2.43                 |
| PKU-RLHF | gemma-2      | 2026.03 | 2.13                 |
| base     | gemma-2      | 2026.03 | 2.1                  |