Harmlessness on Harmlessness
Top result: Curriculum-RLAIF, Average Win Rate 96 (May 26, 2025)
Evaluation Results
| Method | Configuration | Date | Average Win Rate |
|---|---|---|---|
| Curriculum-RLAIF | Base Model=Qwen2.5-32B... | 2025.05 | 96 |
| Implicit Eval. (DPO) | Base Model=Qwen2.5-32B... | 2025.05 | 94 |
| Curriculum-RLAIF | Base Model=LLaMA-3-8B,... | 2025.05 | 93 |
| Internal Eval. | Base Model=Qwen2.5-32B... | 2025.05 | 93 |
| Curriculum-RLAIF | Base Model=Gemma-1-2B,... | 2025.05 | 92 |
| Conventional RLAIF | Base Model=Qwen2.5-32B... | 2025.05 | 91 |
| Internal Eval. | Base Model=Gemma-1-2B,... | 2025.05 | 90 |
| Implicit Eval. (DPO) | Base Model=LLaMA-3-8B,... | 2025.05 | 90 |
| External Eval. | Base Model=Qwen2.5-32B... | 2025.05 | 90 |
| Internal Eval. | Base Model=LLaMA-3-8B,... | 2025.05 | 89 |
| RLCD | Base Model=Qwen2.5-32B... | 2025.05 | 89 |
| External Eval. | Base Model=Gemma-1-2B,... | 2025.05 | 88 |
| Conventional RLAIF | Base Model=LLaMA-3-8B,... | 2025.05 | 88 |
| CAI | Base Model=Qwen2.5-32B... | 2025.05 | 88 |
| Implicit Eval. (DPO) | Base Model=Gemma-1-2B,... | 2025.05 | 86 |
| RLCD | Base Model=LLaMA-3-8B,... | 2025.05 | 85 |
| External Eval. | Base Model=LLaMA-3-8B,... | 2025.05 | 85 |
| RLCD | Base Model=Gemma-1-2B,... | 2025.05 | 83 |
| Conventional RLAIF | Base Model=Gemma-1-2B,... | 2025.05 | 83 |
| CAI | Base Model=LLaMA-3-8B,... | 2025.05 | 83 |
| CAI | Base Model=Gemma-1-2B,... | 2025.05 | 79 |