A Generative Model for Joint Multiple Intent Detection and Slot Filling

About

In task-oriented dialogue systems, spoken language understanding (SLU) is a critical component consisting of two sub-tasks: intent detection and slot filling. Most existing methods focus on single-intent SLU, where each utterance has only one intent. However, in real-world scenarios users usually express multiple intents in a single utterance, which poses a challenge for existing dialogue systems and datasets. In this paper, we propose a generative framework that addresses multiple intent detection and slot filling simultaneously. In particular, we propose an attention-over-attention decoder that handles the variable number of intents and the interference between the two sub-tasks by incorporating an inductive bias into the multi-task learning process. In addition, we construct two new multi-intent SLU datasets from single-intent utterances by taking advantage of the next sentence prediction (NSP) head of the BERT model. Experimental results demonstrate that our proposed attention-over-attention generative model achieves state-of-the-art performance on two public datasets, MixATIS and MixSNIPS, and on our constructed datasets.
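The abstract does not spell out how the NSP head is used to build the multi-intent datasets. One plausible reading, sketched below purely as an assumption, is that pairs of single-intent utterances are joined only when a next-sentence scorer judges the second to be a coherent continuation of the first. The function `combine_pairs`, the `threshold` and `connector` parameters, the slot labels, and the word-overlap scorer `toy_nsp_score` are all hypothetical; in practice the scorer would be replaced by a probability from a BERT NSP head.

```python
from itertools import permutations
from typing import Callable, Dict, List


def toy_nsp_score(text_a: str, text_b: str) -> float:
    """Hypothetical stand-in for an NSP probability: word-overlap (Jaccard) ratio."""
    words_a, words_b = set(text_a.split()), set(text_b.split())
    return len(words_a & words_b) / max(len(words_a | words_b), 1)


def combine_pairs(
    examples: List[Dict],
    nsp_score: Callable[[str, str], float],
    threshold: float = 0.5,
    connector: str = "and",
) -> List[Dict]:
    """Join ordered pairs of single-intent examples that the scorer accepts."""
    combined = []
    for first, second in permutations(examples, 2):
        if first["intent"] == second["intent"]:
            continue  # a pair with the same intent adds no second intent
        text_a = " ".join(first["tokens"])
        text_b = " ".join(second["tokens"])
        if nsp_score(text_a, text_b) < threshold:
            continue  # scorer judges the pair incoherent
        combined.append({
            "tokens": first["tokens"] + [connector] + second["tokens"],
            # The connector word carries no slot, so it is tagged "O".
            "slots": first["slots"] + ["O"] + second["slots"],
            "intents": [first["intent"], second["intent"]],
        })
    return combined


# Illustrative single-intent examples (labels are made up for this sketch).
single_intent = [
    {"tokens": ["play", "jazz", "music"],
     "slots": ["O", "B-genre", "O"],
     "intent": "PlayMusic"},
    {"tokens": ["add", "jazz", "to", "my", "playlist"],
     "slots": ["O", "B-genre", "O", "B-owner", "O"],
     "intent": "AddToPlaylist"},
    {"tokens": ["book", "a", "table", "for", "two"],
     "slots": ["O", "O", "O", "O", "B-party_size"],
     "intent": "BookRestaurant"},
]

combined = combine_pairs(single_intent, toy_nsp_score, threshold=0.1)
for example in combined:
    print(" ".join(example["tokens"]), "->", example["intents"])
```

Passing the scorer in as a callable keeps the pairing logic independent of any particular model, so the toy scorer can be swapped for a real NSP model without changing `combine_pairs`.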

Liz Li, Wei Zhu • 2026

Related benchmarks

Task	Dataset	Result
Slot Filling and Intent Detection	MixSNIPS	Overall Accuracy 87.4	31
Joint Multiple Intent Detection and Slot Filling	MixATIS	Slot F1 89.2	8
Joint Multiple Intent Detection and Slot Filling	MultiATIS (test)	Slot F1 0.954	4
Joint Multiple Intent Detection and Slot Filling	MultiSNIPS (test)	Slot F1 98.1	4
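The page does not define the "Overall Accuracy" metric. A minimal sketch follows, assuming the definition commonly used for multi-intent SLU benchmarks: an utterance counts as correct only if its full set of intents and every slot tag are predicted correctly. The function name `overall_accuracy`, the dictionary layout, and the example labels are assumptions made for illustration.

```python
from typing import Dict, List


def overall_accuracy(predictions: List[Dict], references: List[Dict]) -> float:
    """Fraction of utterances with all intents and all slot tags correct."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        # Intents are compared as sets, so their order does not matter.
        intents_match = set(pred["intents"]) == set(ref["intents"])
        # Slot tags are compared position by position.
        slots_match = list(pred["slots"]) == list(ref["slots"])
        if intents_match and slots_match:
            correct += 1
    return correct / len(references)


references = [
    {"intents": ["PlayMusic", "AddToPlaylist"], "slots": ["O", "B-genre", "O"]},
    {"intents": ["BookRestaurant"], "slots": ["O", "B-party_size"]},
]
predictions = [
    # Same intents in a different order and identical slots: counted as correct.
    {"intents": ["AddToPlaylist", "PlayMusic"], "slots": ["O", "B-genre", "O"]},
    # Correct intent but one wrong slot tag: counted as incorrect.
    {"intents": ["BookRestaurant"], "slots": ["O", "O"]},
]

score = overall_accuracy(predictions, references)
print(f"overall accuracy: {score:.2f}")
```

Because a single wrong slot tag or a missing intent makes the whole utterance incorrect, this metric is stricter than slot F1 or intent accuracy taken separately.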

