
KCS: Diversify Multi-hop Question Generation with Knowledge Composition Sampling

About

Multi-hop question answering faces substantial challenges due to data sparsity, which increases the likelihood of language models learning spurious patterns. To address this issue, prior research has focused on diversifying question generation through content planning and varied expression. However, these approaches often emphasize generating simple questions and neglect the integration of essential knowledge, such as relevant sentences within documents. This paper introduces Knowledge Composition Sampling (KCS), a framework designed to expand the diversity of generated multi-hop questions by sampling varied knowledge compositions within a given context. KCS models knowledge composition selection as a sentence-level conditional prediction task and uses a probabilistic contrastive loss to predict the next most relevant piece of knowledge. During inference, we employ a stochastic decoding strategy to balance accuracy and diversity. Compared to competitive baselines, KCS improves the overall accuracy of knowledge composition selection by 3.9%, and its application for data augmentation yields improvements on the HotpotQA and 2WikiMultihopQA datasets. Our code is available at: https://github.com/yangfanww/kcs.

Yangfan Wang, Jie Liu, Chen Tang, Lian Yan, Jingchi Jiang • 2025
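
The abstract describes choosing a knowledge composition sentence by sentence, with each choice conditioned on the sentences already picked, and sampling stochastically at inference time. The abstract does not say which decoding strategy is used, so the sketch below substitutes temperature-scaled nucleus (top-p) sampling as one plausible choice; it is not the authors' implementation. The names `sample_composition`, `nucleus`, `toy_score`, and the `Scorer` type are hypothetical, and `toy_score` stands in for a trained model that would score a candidate sentence given the current selection.

```python
import math
import random
from typing import Callable, List, Optional, Sequence

# Hypothetical scorer type: given the indices already selected and a candidate
# index, return an unnormalized relevance score (a logit).
Scorer = Callable[[Sequence[int], int], float]


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs: List[float], top_p: float) -> List[int]:
    """Positions of the smallest high-probability set whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept: List[int] = []
    mass = 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept


def sample_composition(
    num_sentences: int,
    score: Scorer,
    k: int = 2,
    temperature: float = 1.0,
    top_p: float = 0.9,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Sample up to k distinct sentence indices, one at a time.

    Each step rescores the remaining sentences conditioned on the current
    selection, then samples from the top-p nucleus of that distribution.
    """
    rng = rng or random.Random()
    selected: List[int] = []
    for _ in range(min(k, num_sentences)):
        candidates = [i for i in range(num_sentences) if i not in selected]
        probs = softmax([score(selected, c) for c in candidates], temperature)
        kept = nucleus(probs, top_p)
        # rng.choices renormalizes the weights of the kept positions.
        position = rng.choices(kept, weights=[probs[j] for j in kept], k=1)[0]
        selected.append(candidates[position])
    return selected


def toy_score(selected: Sequence[int], candidate: int) -> float:
    """Assumed stand-in for a learned scorer; ignores the selection for brevity."""
    base = [2.0, 0.5, 1.5, -1.0]
    return base[candidate]
```

Calling `sample_composition(4, toy_score, k=2, rng=random.Random(0))` draws one two-sentence composition. Raising `temperature` or `top_p` spreads probability over more sentences and so yields more varied compositions, while a very small `top_p` reduces the procedure to greedy selection.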

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Multi-hop Question Answering | 2WikiMultihopQA | EM | 66.5 | 387 |
| Knowledge composition selection | HotpotQA | Precision@2 | 64.18 | 23 |
| Knowledge composition selection | 2WikiMultihopQA | Precision@K=2 | 84.79 | 23 |
| Multi-hop Question Answering | HotpotQA | EM | 61 | 10 |
| Multi-hop Question Generation | HotpotQA | Pairwise BLEU | 68.1 | 4 |