DOC: Deep Open Classification of Text Documents

About

Traditional supervised learning makes the closed-world assumption that the classes appearing in the test data must have appeared in training. This also applies to text learning or text classification. As learning is used increasingly in dynamic open environments where some new/test documents may not belong to any of the training classes, identifying these novel documents during classification presents an important problem. This problem is called open-world classification or open classification. This paper proposes a novel deep-learning-based approach. It outperforms existing state-of-the-art techniques dramatically.

Lei Shu, Hu Xu, Bing Liu • 2017
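
The open classification setting described in the abstract can be illustrated with a minimal sketch: a document is either assigned to one of the known training classes or rejected as belonging to none of them. The sketch assumes one independent sigmoid score per known class and a fixed rejection threshold. The function name `open_classify`, the class labels, the raw scores, and the 0.5 threshold are all hypothetical and chosen for illustration; the abstract does not describe the paper's model, so this should not be read as its implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def open_classify(scores: dict[str, float], threshold: float = 0.5) -> str:
    """Return the best-scoring known class, or "unknown" if none clears the threshold.

    `scores` maps each known class label to a raw (pre-sigmoid) score.
    The threshold value is an assumption made for this illustration.
    """
    probs = {label: sigmoid(score) for label, score in scores.items()}
    best_label = max(probs, key=probs.get)
    return best_label if probs[best_label] >= threshold else "unknown"


# A document that scores high on one known class is assigned to it.
known = open_classify({"sports": 2.1, "politics": -1.3, "tech": -0.4})

# A document that scores low on every known class is rejected as novel.
novel = open_classify({"sports": -2.0, "politics": -1.5, "tech": -0.9})

print(known, novel)
```

A closed-world classifier would always return the highest-scoring label, so the second document would be forced into one of the training classes. The rejection branch is what makes the classifier open.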

Related benchmarks

Task	Dataset	Result
Unknown Intent Detection	ATIS (test)	Macro F1 62.8	20
Unknown Intent Detection	Snips (test)	Macro F1 72.5	15
Unknown Intent Detection	StackOverflow 50% seen classes (test)	Accuracy 61.62	11
Open Intent Classification	BANKING 50% known classes (test)	Accuracy 77.16	10
Relation Classification	FewRel	Accuracy 93.25	8
open-set relation extraction	FewRel (test)	Accuracy 63.96	8
Relation Classification	TACRED n known relations	Accuracy 93.7	8
open-set relation extraction	TACRED (test)	Accuracy 0.7008	8
Unknown Intent Detection	StackOverflow 25% seen classes (test)	Accuracy 60.68	6
Unknown Intent Detection	M-CID-EN 25% seen classes (test)	Accuracy 49.32	6

Showing 10 of 25 rows
