
Reward Prediction on Human preference annotation (test)

30 Effort

Qwen2.5-7B-Instr.

Jan 23, 2026

Evaluation Results

Method  Links
2026.01  30  26  28  28
2026.01  44  22  30  32
2026.01  44  32  35  37
2026.01  52  25  31  36
2026.01  56  54  49  53
2026.01  57  25  29  37
2026.01  61  53  45  53
         70  76  70  72