Machine Translation on WMT20

Reward: 0.2

RLOO

Updated 5mo ago

Evaluation Results

Method	Links	Reward
RLOO 2025.03		0.2	36	0.005	-
ZOPrO 2025.03		0.15	6	0.025	-
PPO 2025.03		0.11	115	0.0015	-