Summarization on Summarize from Feedback

Reward: 70

PPO

Updated 5mo ago

Evaluation Results

Method	Links	Reward
PPO 2025.03		70	49	0.014	-
ZOPrO 2025.03		17	14	0.012	-
RLOO 2025.03		15	42	0.0035	-