Safety Evaluation on WildChat (test)
[Chart: WildChat Score. Highlighted result: SFT-DPO + LoRA, 69.85 (Feb 8, 2026).]
Evaluation Results
| Method | Base Model | Date | WildChat Score |
|---|---|---|---|
| SFT-DPO + LoRA | Qwen2.5-7B-... | 2026.02 | 69.85 |
| SFT | Qwen2.5-7B-... | 2026.02 | 64.2 |
| SFT-DPO | Llama3.1-8B... | 2026.02 | 59.4 |
| SFT | Llama3.1-8B... | 2026.02 | 50 |
| DPO + OGPSA | Qwen2.5-7B-... | 2026.02 | 49.4 |
| SFT + OGPSA | Llama3.1-8B... | 2026.02 | 47 |
| DPO | Llama3.1-8B... | 2026.02 | 42.8 |
| SFT + LoRA | Llama3.1-8B... | 2026.02 | 42.6 |
| DPO + OGPSA | Llama3.1-8B... | 2026.02 | 38.4 |
| SFT + Merge | Llama3.1-8B... | 2026.02 | 31.2 |
| Instruct Baseline | Qwen2.5-7B-... | 2026.02 | 16 |
| Instruct Baseline | Llama3.1-8B... | 2026.02 | 15.8 |
| SFT + General Data | Llama3.1-8B... | 2026.02 | 14.4 |