
Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness

About

Instruction tuning fine-tunes pre-trained Multi-modal Large Language Models (MLLMs) to handle real-world tasks. However, the rapid expansion of visual instruction datasets introduces data redundancy, leading to excessive computational costs. We propose a collaborative framework, DataTailor, which leverages three key principles (informativeness, uniqueness, and representativeness) for effective data selection. We argue that a valuable sample should be informative of the task, non-redundant, and representative of the sample distribution (i.e., not an outlier). We further propose practical ways to score each sample against each principle; these scores adapt automatically to a given dataset without tedious hyperparameter tuning. Comprehensive experiments on various benchmarks demonstrate that DataTailor achieves 101.3% of the performance of full-data fine-tuning with only 15% of the data, significantly reducing computational costs while maintaining superior results. This exemplifies the "Less is More" philosophy in MLLM development. The code and data are available at https://github.com/Yuqifan1117/DataTailor.
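
As a rough illustration of the selection idea in the abstract, the sketch below ranks samples by three per-sample scores and keeps the top 15%. It is not the DataTailor implementation: the abstract does not say how each score is computed or how the scores are combined, so the function names (`min_max`, `select_top_fraction`), the min-max normalization, and the simple sum used to combine scores are all assumptions made for this example.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_top_fraction(
    informativeness: Sequence[float],
    uniqueness: Sequence[float],
    representativeness: Sequence[float],
    fraction: float = 0.15,
) -> List[int]:
    """Return indices of the highest-scoring samples, best first.

    The three inputs are hypothetical precomputed scores, one per sample.
    """
    n = len(informativeness)
    if len(uniqueness) != n or len(representativeness) != n:
        raise ValueError("all score lists must have the same length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if n == 0:
        return []

    # Assumption: put the three scores on a common scale, then add them.
    # The paper's own combination rule may differ.
    parts = [
        min_max(informativeness),
        min_max(uniqueness),
        min_max(representativeness),
    ]
    combined = [sum(column) for column in zip(*parts)]

    # Keep at least one sample, even for very small datasets.
    k = max(1, round(n * fraction))
    ranked = sorted(range(n), key=lambda i: combined[i], reverse=True)
    return ranked[:k]
```

With 20 samples and the default `fraction=0.15`, the function returns the indices of the 3 samples with the highest combined score.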

Qifan Yu, Zhebei Shen, Zhongqi Yue, Yang Wu, Bosheng Qin, Wenqiao Zhang, Yunfei Li, Juncheng Li, Siliang Tang, Yueting Zhuang • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Hallucination | POPE Popular | -- | 372 |
| Visual Question Answering | GQA (test-dev) | Accuracy 49.5 | 236 |
| Object Hallucination Evaluation | POPE Adversarial | Accuracy 85.3 | 159 |
| Object Hallucination Evaluation | POPE (Random) | Accuracy 85.3 | 152 |
| Object Hallucination Evaluation | POPE (test) | Accuracy 85.3 | 107 |
| Visual Question Answering | Vizwiz (val) | VQA Score 31.8 | 66 |
| Multimodal Question Answering | ScienceQA | Accuracy 71 | 61 |
| Multimodal Understanding | MME Perception | -- | 59 |
| Multimodal Understanding | MME Cognition | Score 319.2 | 45 |
| Visual Question Answering | VizWiz | Acc 49.5 | 31 |
Showing 10 of 21 rows
