
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

About

Generative foundation models are susceptible to implicit biases that can arise from extensive unsupervised training data. Such biases can produce suboptimal samples, skewed outcomes, and unfairness, with potentially serious consequences. Consequently, aligning these models with human ethics and preferences is an essential step toward ensuring their responsible and effective deployment in real-world applications. Prior research has primarily employed Reinforcement Learning from Human Feedback (RLHF) to address this problem, where generative models are fine-tuned with RL algorithms guided by a human-feedback-informed reward model. However, the inefficiencies and instabilities associated with RL algorithms frequently present substantial obstacles to successful alignment, necessitating a more robust and streamlined approach. To this end, we introduce a new framework, Reward rAnked FineTuning (RAFT), designed to align generative models effectively. Using a reward model and a sufficient number of samples, our approach selects the high-quality samples, discards those that exhibit undesired behavior, and then enhances the model by fine-tuning on the filtered samples. Our studies show that RAFT can effectively improve model performance on both reward learning and other automated metrics, for large language models as well as diffusion models.

Hanze Dong, Wei Xiong, Deepanshu Goyal, Yihan Zhang, Winnie Chow, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang • 2023
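
The selection step described in the abstract (sample, score with a reward model, keep the high-reward samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_top_samples`, the per-prompt top-`keep` rule, the parameters `k` and `keep`, and the toy `toy_generate` and `toy_reward` stand-ins are all assumptions made for illustration. The fine-tuning step itself is omitted.

```python
from typing import Callable, List, Tuple


def select_top_samples(
    prompts: List[str],
    generate: Callable[[str, int], List[str]],
    reward: Callable[[str, str], float],
    k: int = 4,
    keep: int = 1,
) -> List[Tuple[str, str]]:
    """Return the `keep` highest-reward (prompt, response) pairs per prompt.

    Hypothetical helper: the per-prompt ranking rule is an assumption,
    not a detail taken from the abstract.
    """
    selected: List[Tuple[str, str]] = []
    for prompt in prompts:
        # Draw k candidate responses from the current model.
        candidates = generate(prompt, k)
        # Rank candidates by reward, highest first.
        ranked = sorted(candidates, key=lambda r: reward(prompt, r), reverse=True)
        # Keep only the top-ranked candidates; the rest are discarded.
        selected.extend((prompt, r) for r in ranked[:keep])
    return selected


# Toy stand-ins (assumptions) so the sketch runs without a real model.
def toy_generate(prompt: str, k: int) -> List[str]:
    return [f"{prompt} " + "ok " * i for i in range(k)]


def toy_reward(prompt: str, response: str) -> float:
    # Pretend longer responses are better; a real reward model goes here.
    return float(len(response))


batch = select_top_samples(["hello"], toy_generate, toy_reward, k=3, keep=1)
```

In a full loop, the returned pairs would serve as the supervised fine-tuning set for the next round, after which sampling would repeat with the updated model.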

Related benchmarks

Task | Dataset | Result | Rank
Mathematical Reasoning | MATH 500 | Pass@1: 78.68 | 153
Mathematical Reasoning | Minerva | Pass@1: 45.29 | 138
Mathematical Reasoning | Olympiad Bench | Pass@1 Accuracy: 48.34 | 115
Mathematical Reasoning | AMC23 | Pass@1: 61.8 | 43
Mathematical Reasoning | AIME 24 | Pass@1: 22.93 | 39
Scientific and General Reasoning | MMLU-Pro | Pass@1: 59.53 | 21
Scientific and General Reasoning | GPQA Diamond | Pass@1: 37.47 | 21
Scientific and General Reasoning | Theorem QA | Pass@1: 41.05 | 18
Text-to-Image Generation | DiffusionDB Real User Prompts, 466 prompts (test) | Win Count: 1.34e+3 | 7
Text-to-Image Generation | MT Bench, 90 prompts (test) | Total Wins: 578 | 7