Personalized Text Generation on Reddit Topic Writing
[Chart: Accuracy (User-level, GT) over time. Highlighted point: PARL-0, 95.4, May 29, 2026.]
Evaluation Results
| Method | Configuration | Date | Accuracy (User-level, GT) | Accuracy (User-level, Non) | Accuracy (User-level, RAG) | Accuracy (User-level, Non-Think) | Accuracy (User-level, RAG-Think) | Accuracy (User-level, SFT) | Accuracy (User-level, GRPO) | Accuracy (User-level, SFT+GRPO) | Max Difference | User Coverage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PARL-0 | Optimization=Scoring f... | 2026.05 | 95.4 | 97.2 | 98.5 | 50.5 | 81.8 | 97.1 | 98.2 | 97.1 | 0.031 | 100 |
| LM-8B | Model=Qwen3-8B | 2026.05 | 89.2 | 42.2 | 61.5 | 50.6 | 70.7 | 80.3 | 89.1 | 89.9 | 0.007 | 75 |
| PARL-B | Optimization=Margin-on... | 2026.05 | 86 | 10.1 | 23.4 | 18.4 | 30 | 79 | 48.4 | 82.4 | 0.036 | 99.4 |
| PARL-A | Optimization=GT-scaled... | 2026.05 | 84.8 | 21.9 | 26.7 | 19.6 | 29.6 | 76 | 26.8 | 77 | 0.078 | 99.9 |
| LM-235B | Model=Qwen3-235B-A22B-... | 2026.05 | 84 | 37.1 | 53.1 | 40.8 | 61.7 | 76.4 | 79.8 | 83.6 | 0.004 | 81.5 |