Reward Modeling on E-commerce customer service scenario
[Chart: Score by date. Best result: RM-Distiller + Rule, Score 3.6, Jan 20, 2026]
Evaluation Results
Method               Setting           Date     Score
RM-Distiller + Rule  Policy=Qwen-3-8B  2026.01  3.6
RM-Distiller         Policy=Qwen-3-8B  2026.01  3.5
BT Classifier        Policy=Qwen-3-8B  2026.01  3.4
Baseline             Policy=Qwen-3-8B  2026.01  3.1