Personalized Reward Modeling on Reddit TLDR 100 examples Overall
[Chart: User-level Accuracy over time. Top result: 69.6 (MRM), Jan 26, 2026.]
Updated 4d ago
Evaluation Results
Method              Setting                        Date      User-level Accuracy
MRM                 Base Reward Model=Skyw...      2026.01   69.6
MRM                 Base Reward Model=Skyw...      2026.01   68.8
LoRe                -                              2026.01   68.3
SynthesizeMe        Adaptation Protocol=FT         2026.01   68.1
GPO                 -                              2026.01   68
BT                  -                              2026.01   67.8
VPL                 -                              2026.01   67.6
SynthesizeMe        Adaptation Protocol=ICL        2026.01   66.5
PAL                 -                              2026.01   64.5
Skywork-Reward V2   Base Reward Model=V2           2026.01   64.4
Skywork-Reward V1   Base Reward Model=V1           2026.01   62.6