Personalized Reward Modeling on Lamp-QA (OOD)
[Chart: Arts Score over time, with selectable series for Arts Score, Personalization Score, Social Score, and Average Score. Highlighted point: Qwen3-235B-A22B, Arts Score 60, Feb 12, 2026.]
Updated 4d ago
Evaluation Results

Method           Details                    Date     Arts Score  Personalization Score  Social Score  Average Score
Qwen3-235B-A22B  Model Size=235B-A22B       2026.02  60          65.7                   60            61.9
Qwen3-32B        Model Size=32B             2026.02  54.3        60                     54.3          56.2
LLaMA3.1-70B     Model Size=70B             2026.02  54.3        65.7                   60            60
P-GenRM          Model Size=8B, Inferen...  2026.02  54.3        71.4                   65.7          63.8
Qwen3-8B         Model Size=8B              2026.02  48.6        54.3                   60            54.3
LLaMA3.1-8B      Model Size=8B              2026.02  48.6        54.3                   54.3          52.4
SynthMe-8B       Model Size=8B              2026.02  48.6        65.7                   60            58.1
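
The page does not say how Average Score is computed. The listed values look consistent with an unweighted mean of the Arts, Personalization, and Social scores, rounded to one decimal. That reading is an inference from the numbers, not a documented formula. A minimal Python sketch that checks it against the scores above (the names `SCORES`, `mean_of_categories`, and `check_averages` are hypothetical helpers for illustration):

```python
# Scores transcribed from the leaderboard above:
# method -> (Arts, Personalization, Social, listed Average).
SCORES = {
    "Qwen3-235B-A22B": (60.0, 65.7, 60.0, 61.9),
    "Qwen3-32B": (54.3, 60.0, 54.3, 56.2),
    "LLaMA3.1-70B": (54.3, 65.7, 60.0, 60.0),
    "P-GenRM": (54.3, 71.4, 65.7, 63.8),
    "Qwen3-8B": (48.6, 54.3, 60.0, 54.3),
    "LLaMA3.1-8B": (48.6, 54.3, 54.3, 52.4),
    "SynthMe-8B": (48.6, 65.7, 60.0, 58.1),
}


def mean_of_categories(arts, personalization, social):
    """Unweighted mean of the three category scores, rounded to one decimal.

    Assumption: this is how the page derives Average Score; the page
    itself does not state the formula.
    """
    return round((arts + personalization + social) / 3, 1)


def check_averages(scores=SCORES, tol=0.05):
    """Return the methods whose listed average differs from the recomputed mean."""
    mismatches = []
    for method, (arts, personalization, social, listed) in scores.items():
        recomputed = mean_of_categories(arts, personalization, social)
        if abs(recomputed - listed) > tol:
            mismatches.append(method)
    return mismatches


if __name__ == "__main__":
    # An empty list means every listed average matches the recomputed mean.
    print(check_averages())
```

For all seven rows the recomputed mean matches the listed Average Score, so the check returns an empty list.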