Pairwise preference comparison on HH-RLHF held-out (test)
[Chart: Win Rate over time. Top result: DP-RLHF, 53.02 (Mar 23, 2026). Available metrics: Win Rate, Reward Accuracy.]
Updated 25d ago
Evaluation Results
| Method | Setting | Date | Win Rate | Reward Accuracy |
|---|---|---|---|---|
| DP-RLHF | Privacy budget (ε) = 0.5 | 2026.03 | 53.02 | - |
| DP-RLHF | Privacy budget (ε) = 1.0 | 2026.03 | 52.85 | - |
| DP-RLHF | Privacy budget (ε) = 2.0 | 2026.03 | 52.82 | - |
| DP-DPO | Privacy budget (ε) = 2.0 | 2026.03 | 51.99 | - |
| DP-DPO | Privacy budget (ε) = 1.0 | 2026.03 | 51.92 | - |
| DP-DPO | Privacy budget (ε) = 0.5 | 2026.03 | 51.86 | - |
| Private RM | Privacy budget (ε) = 0.5 | 2026.03 | - | 58.93 |
| Private RM | Privacy budget (ε) = 1.0 | 2026.03 | - | 59.44 |
| Private RM | Privacy budget (ε) = 2.0 | 2026.03 | - | 59.69 |