
Model Alignment on HH-RLHF 0-shot (test)

62.68 Harmlessness BLEU

ChatGPT

5.916820.653435.3950.1266 Apr 2, 2026
Updated 15d ago

Evaluation Results

Method Links
2026.04
62.6810.2973.0170.7911.8675.1168.611.4174.54
2026.04
30.93.3363.534.63.964.833.63.7464.45
2026.04
233.0666.6733.473.7465.6930.643.5465.96
2026.04
10.511.853.2318.742.0246.9716.511.9648.66
2026.04
8.11.7353.5114.181.8745.5712.541.8347.72