DOC: Deep Open Classification of Text Documents
About
Traditional supervised learning makes the closed-world assumption that the classes that appear in the test data must have appeared in training. This also applies to text learning or text classification. As learning is increasingly used in dynamic open environments where some new/test documents may not belong to any of the training classes, identifying these novel documents during classification becomes an important problem. This problem is called open-world classification or open classification. This paper proposes a novel deep-learning-based approach. It outperforms existing state-of-the-art techniques dramatically.
Lei Shu, Hu Xu, Bing Liu • 2017
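The open classification setting described in the abstract can be illustrated with a small decision rule: assign a document to one of the seen classes, or reject it as unknown when no seen class is confident enough. The abstract does not spell out the paper's actual method, so the sketch below is only a generic illustration and should not be read as the authors' implementation. The per-class sigmoid scoring, the function name `open_classify`, the `"unknown"` label, and the fixed threshold of 0.5 are all assumptions made for the example.

```python
import math


def sigmoid(x):
    """Map a raw score (logit) to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def open_classify(logits, class_names, threshold=0.5):
    """Return the best-scoring seen class, or "unknown" if every class
    score falls below the threshold.

    Hypothetical helper: `logits` holds one raw score per seen class,
    as might come from any trained text classifier. The threshold value
    is an assumption, not a number taken from the paper.
    """
    # Score each seen class independently (one-vs-rest style).
    probs = [sigmoid(z) for z in logits]
    best = max(range(len(probs)), key=probs.__getitem__)
    # Reject the document when even the best class is not confident enough.
    if probs[best] < threshold:
        return "unknown"
    return class_names[best]


seen = ["flight", "airfare", "ground_service"]
print(open_classify([2.0, -1.0, -3.0], seen))   # confident in the first class
print(open_classify([-2.0, -1.0, -3.0], seen))  # no class passes the threshold
```

Scoring each class independently, rather than with a softmax that forces the scores to sum to one, lets all seen classes receive low scores at once, which is what makes rejection possible in this sketch.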
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Unknown Intent Detection | ATIS (test) | Macro F1 | 62.8 | 20 |
| Unknown Intent Detection | Snips (test) | Macro F1 | 72.5 | 15 |
| Unknown Intent Detection | StackOverflow 50% seen classes (test) | Accuracy | 61.62 | 11 |
| Open Intent Classification | BANKING 50% known classes (test) | Accuracy | 77.16 | 10 |
| Relation Classification | FewRel | Accuracy | 93.25 | 8 |
| open-set relation extraction | FewRel (test) | Accuracy | 63.96 | 8 |
| Relation Classification | TACRED n known relations | Accuracy | 93.7 | 8 |
| open-set relation extraction | TACRED (test) | Accuracy | 0.7008 | 8 |
| Unknown Intent Detection | StackOverflow 25% seen classes (test) | Accuracy | 60.68 | 6 |
| Unknown Intent Detection | M-CID-EN 25% seen classes (test) | Accuracy | 49.32 | 6 |
Showing 10 of 25 rows