
Pairwise Evaluation on HH-RLHF (test)

Test Accuracy: 95.2

pairwise evaluator

[Chart: Test Accuracy, Apr 10, 2026]
Updated 1mo ago

Evaluation Results

Method | Links
2026.04
95.2 | 1.6445 | 0.9972 | 1.51 | 95.2
84.5 | - | - | - | -
2026.04
72 | - | - | - | -
65 | - | - | - | -
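
For readers unfamiliar with the metric, here is a minimal sketch of how a test accuracy for a pairwise evaluator is commonly computed: the percentage of (chosen, rejected) pairs in which the evaluator scores the chosen response higher. This is an assumption about the metric, not something the leaderboard states; the function name `pairwise_accuracy`, the tie handling, and the example scores are all hypothetical.

```python
from typing import Sequence


def pairwise_accuracy(
    chosen_scores: Sequence[float], rejected_scores: Sequence[float]
) -> float:
    """Percentage of pairs where the chosen response outscores the rejected one."""
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("score lists must have the same length")
    if not chosen_scores:
        raise ValueError("need at least one pair")
    # Assumption: ties count as incorrect in this sketch.
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return 100.0 * correct / len(chosen_scores)


# Hypothetical scores for illustration only; not taken from the leaderboard.
chosen = [0.9, 0.2, 0.7, 0.6]
rejected = [0.1, 0.5, 0.3, 0.4]
print(pairwise_accuracy(chosen, rejected))  # 75.0
```

Under this definition, a reported value such as 95.2 would mean the evaluator agreed with the human preference label on 95.2% of test pairs.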