RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback

About

Multimodal Large Language Models (MLLMs) have recently demonstrated impressive capabilities in multimodal understanding, reasoning, and interaction. However, existing MLLMs commonly suffer from serious hallucination problems, generating text that is not factually grounded in the associated images. This problem makes existing MLLMs untrustworthy and thus impractical in real-world (especially high-stakes) applications. To address the challenge, we present RLHF-V, which enhances MLLM trustworthiness via behavior alignment from fine-grained correctional human feedback. Specifically, RLHF-V collects human preferences in the form of segment-level corrections on hallucinations, and performs dense direct preference optimization over the human feedback. Comprehensive experiments on five benchmarks in both automatic and human evaluation show that RLHF-V enables substantially more trustworthy MLLM behaviors with promising data and computation efficiency. Remarkably, using 1.4k annotated data samples, RLHF-V significantly reduces the hallucination rate of the base MLLM by 34.8%, outperforming the concurrent LLaVA-RLHF trained on 10k annotated data. The final model achieves state-of-the-art performance in trustworthiness among open-source MLLMs, and shows better robustness than GPT-4V in preventing hallucinations arising from over-generalization. We open-source our code, model, and data at https://github.com/RLHF-V/RLHF-V.
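
The abstract names "dense direct preference optimization" over segment-level corrections but does not spell out the objective. The following is a minimal sketch, not the authors' implementation: it assumes the standard DPO loss and assumes that tokens inside human-corrected segments are up-weighted by a factor `gamma` when scoring a response. The function names, the default values of `gamma` and `beta`, and the exact weighting scheme are illustrative assumptions.

```python
import math


def weighted_logprob(token_logps, corrected_mask, gamma=5.0):
    """Weighted mean of per-token log-probs.

    Tokens flagged in corrected_mask count gamma times; all others count once.
    (Assumed weighting scheme, for illustration only.)
    """
    weights = [gamma if corrected else 1.0 for corrected in corrected_mask]
    total = sum(w * lp for w, lp in zip(weights, token_logps))
    return total / sum(weights)


def dense_dpo_loss(policy_w, ref_w, mask_w, policy_l, ref_l, mask_l,
                   beta=0.5, gamma=5.0):
    """DPO-style loss for one (corrected, original) response pair.

    policy_* / ref_*: per-token log-probs under the trained and reference model.
    *_w is the human-corrected (preferred) response, *_l the hallucinated one.
    """
    margin_w = (weighted_logprob(policy_w, mask_w, gamma)
                - weighted_logprob(ref_w, mask_w, gamma))
    margin_l = (weighted_logprob(policy_l, mask_l, gamma)
                - weighted_logprob(ref_l, mask_l, gamma))
    logits = beta * (margin_w - margin_l)
    # -log(sigmoid(logits)), written in a numerically stable form
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Hypothetical two-token responses; the second token lies in a corrected segment.
loss = dense_dpo_loss(
    policy_w=[-1.0, -0.5], ref_w=[-1.0, -1.0], mask_w=[False, True],
    policy_l=[-1.0, -2.0], ref_l=[-1.0, -1.0], mask_l=[False, True],
)
print(loss)
```

With `gamma=1.0` the score reduces to a plain per-token mean, so the sketch falls back to a length-normalized DPO loss. Larger `gamma` concentrates the learning signal on the segments the annotators actually changed.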

Tianyu Yu, Yuan Yao, Haoye Zhang, Taiwen He, Yifeng Han, Ganqu Cui, Jinyi Hu, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun, Tat-Seng Chua • 2023

Related benchmarks

Task	Dataset	Result
Visual Question Answering	VizWiz	Accuracy 54.2	1820
Visual Question Answering	GQA	--	1425
Multimodal Understanding	MMBench	--	847
Multimodal Reasoning	MM-Vet	MM-Vet Score 30.9	517
Hallucination Evaluation	CHAIR	CHAIR_s 44.6	393
Hallucination Evaluation	MMHal-Bench	MMHal Score 2.59	306
Hallucination Evaluation	AMBER	--	222
Hallucination Evaluation	POPE	--	217
Vision Understanding	MMBench	--	141
Multimodal Hallucination Evaluation	MMHal-Bench	Average Score 2.81	129

Showing 10 of 42 rows
