Personalized Reward Modeling on LaMP-QA (OOD)

Arts Score: 60

Qwen3-235B-A22B

Updated 5mo ago

Evaluation Results

Method	Links	Arts	(unlabeled)	(unlabeled)	Average
Qwen3-235B-A22B 2026.02		60	65.7	60	61.9
Qwen3-32B 2026.02		54.3	60	54.3	56.2
LLaMA3.1-70B 2026.02		54.3	65.7	60	60
P-GenRM 2026.02		54.3	71.4	65.7	63.8
Qwen3-8B 2026.02		48.6	54.3	60	54.3
LLaMA3.1-8B 2026.02		48.6	54.3	54.3	52.4
SynthMe-8B 2026.02		48.6	65.7	60	58.1
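
The last numeric column in each row appears to be the unweighted mean of the three scores before it. The page does not say this; it is an inference from the numbers. A minimal sketch of that check, with the scores copied from the table and the names `rows` and `mismatches` chosen only for illustration:

```python
# Scores copied from the table above: three per-column scores,
# followed by the value reported in the last column.
rows = {
    "Qwen3-235B-A22B": (60.0, 65.7, 60.0, 61.9),
    "Qwen3-32B": (54.3, 60.0, 54.3, 56.2),
    "LLaMA3.1-70B": (54.3, 65.7, 60.0, 60.0),
    "P-GenRM": (54.3, 71.4, 65.7, 63.8),
    "Qwen3-8B": (48.6, 54.3, 60.0, 54.3),
    "LLaMA3.1-8B": (48.6, 54.3, 54.3, 52.4),
    "SynthMe-8B": (48.6, 65.7, 60.0, 58.1),
}

# Assumption: the last column is the plain mean of the first three,
# reported to one decimal place. Collect any row where that fails.
mismatches = {
    name: (round(sum(scores[:3]) / 3, 1), scores[3])
    for name, scores in rows.items()
    if abs(sum(scores[:3]) / 3 - scores[3]) > 0.05
}

print(mismatches)  # → {}
```

An empty result means every row is consistent with that reading. It does not tell us what the two middle columns measure.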