Personalized Reward Modeling on LaMP 4
[Chart: ROUGE@N=30. Best result: 17.9 (Probabilistic User RM), May 9, 2026. Selectable metrics: ROUGE@N=30, Pairwise Accuracy, Gain.]
Updated 21d ago
Evaluation Results
| Method | Cost/query | Date | ROUGE@N=30 | Pairwise Accuracy | Gain |
|---|---|---|---|---|---|
| Probabilistic User RM | 0.05 s | 2026.05 | 17.9 | - | 15.5 |
| LLM tournament (Qwen3-4B) | 1.63 s | 2026.05 | 15.7 | 49.9 | 1.3 |
| Random | - | 2026.05 | 15.5 | 50 | - |
| LLM pointwise (GPT-class) | 30 calls | 2026.05 | 14.4 | - | 7.1 |