Safety Evaluation on CValues
Best result: ROSE, 89.25 Accuracy
[Chart: Accuracy over time; data point dated Feb 19, 2024]
Evaluation Results
Method               Details                    Date      Accuracy
ROSE                 Model=Qwen-chat-7B, De...  2024.02   89.25
Qwen-chat-7B         Model=Qwen-chat-7B, De...  2024.02   89.19
ROSE                 Model=InternLM-chat-7B...  2024.02   85.92
InternLM-chat-7B     Model=InternLM-chat-7B...  2024.02   85.28
ROSE                 Model=Chinese-Alpaca-7...  2024.02   84.22
Chinese-Alpaca-7B    Model=Chinese-Alpaca-7...  2024.02   80.37
ROSE                 Model=Alpaca-7B, Decod...  2024.02   72.49
Alpaca-7B            Model=Alpaca-7B, Decod...  2024.02   68.81
ROSE                 Model=Vicuna-7B, Decod...  2024.02   67.82
Vicuna-7B            Model=Vicuna-7B, Decod...  2024.02   64.89