Reward Modeling on PersonalRewardBench (test)
Top result: P-GenRM-8B, 65.21 Macro Accuracy
[Chart: Macro Accuracy, Feb 12, 2026]
Updated 4d ago
Evaluation Results
| Method | Attributes | Date | Macro Accuracy |
| --- | --- | --- | --- |
| P-GenRM-8B | parameter count=8B | 2026.02 | 65.21 |
| o3 | | 2026.02 | 63.33 |
| SynthesizeMe 70B | parameter count=70B | 2026.02 | 61.51 |
| Fine-tuned BT-70B | parameter count=70B, t... | 2026.02 | 60.64 |
| Llama3.1-70B | parameter count=70B | 2026.02 | 58.27 |
| Llama3.1-8B | parameter count=8B | 2026.02 | 56.24 |