milIE: Modular & Iterative Multilingual Open Information Extraction

About

Open Information Extraction (OpenIE) is the task of extracting (subject, predicate, object) triples from natural language sentences. Current OpenIE systems extract all triple slots independently. In contrast, we explore the hypothesis that it may be beneficial to extract triple slots iteratively: first extract the easy slots, then extract the difficult ones conditioned on the easy slots, thereby achieving a better overall extraction. Based on this hypothesis, we propose a neural OpenIE system, milIE, that operates in an iterative fashion. Because it is iterative, the system is also modular: rule-based extraction systems can be integrated seamlessly with the neural end-to-end system, so that they supply extraction slots which milIE can leverage when extracting the remaining slots. We confirm our hypothesis empirically: milIE outperforms state-of-the-art systems on multiple languages ranging from Chinese to Arabic. Additionally, we are the first to provide OpenIE test datasets for Arabic and Galician.

Bhushan Kotnis, Kiril Gashteovski, Daniel Oñoro Rubio, Vanesa Rodriguez-Tembras, Ammar Shaker, Makoto Takamoto, Mathias Niepert, Carolin Lawrence • 2021
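
The iterative, modular extraction described in the abstract can be illustrated with a minimal sketch. This is not the milIE implementation: the function `extract_iteratively`, the slot names, the extraction order, and the toy string-matching extractors are all hypothetical stand-ins for the neural and rule-based components. The sketch only shows the control flow: slots are filled one at a time, each extractor sees the slots filled so far, and an external system may supply some slots up front.

```python
from typing import Callable, Dict, List, Optional

Slots = Dict[str, Optional[str]]
# Hypothetical interface: an extractor sees the sentence and the slots
# filled so far, and returns a text span for its own slot (or None).
SlotExtractor = Callable[[str, Slots], Optional[str]]


def extract_iteratively(
    sentence: str,
    order: List[str],
    extractors: Dict[str, SlotExtractor],
    given: Optional[Slots] = None,
) -> Slots:
    """Fill triple slots one at a time, conditioning on earlier slots.

    `given` models slots supplied by an external (e.g. rule-based) system;
    those slots are kept and only the remaining ones are extracted.
    """
    slots: Slots = {"subject": None, "predicate": None, "object": None}
    if given:
        slots.update(given)
    for name in order:
        if slots[name] is not None:
            continue  # already supplied, nothing to extract
        slots[name] = extractors[name](sentence, dict(slots))
    return slots


# Toy extractors (assumptions for illustration only, not real models).
TOY_VERBS = {"founded", "wrote", "visited"}


def toy_predicate(sentence: str, slots: Slots) -> Optional[str]:
    # "Easy" slot in this toy setup: look up a verb from a fixed list.
    for token in sentence.split():
        if token in TOY_VERBS:
            return token
    return None


def toy_subject(sentence: str, slots: Slots) -> Optional[str]:
    # Conditioned on the predicate: take the text before it.
    predicate = slots.get("predicate")
    if predicate is None:
        return None
    before, _, _ = sentence.partition(predicate)
    return before.strip() or None


def toy_object(sentence: str, slots: Slots) -> Optional[str]:
    # Conditioned on the predicate: take the text after it.
    predicate = slots.get("predicate")
    if predicate is None:
        return None
    _, _, after = sentence.partition(predicate)
    return after.strip(" .") or None


EXTRACTORS = {
    "subject": toy_subject,
    "predicate": toy_predicate,
    "object": toy_object,
}

triple = extract_iteratively(
    "Marie Curie founded the Radium Institute.",
    order=["predicate", "subject", "object"],
    extractors=EXTRACTORS,
)
print(triple)
```

Passing `given={"predicate": "founded"}` mimics the modular case in which another system supplies one slot and only the remaining slots are extracted.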

Related benchmarks

Task	Dataset	Result
Open Information Extraction	BenchIE binary English	F1 Score: 27.9	10
Open Information Extraction	CaRB-nary English	F1 Score: 45	10
Open Information Extraction	BenchIE Chinese (test)	F1 Score: 20.5	5
Open Information Extraction	BenchIE German (test)	F1 Score: 10.3	5
Open Information Extraction	BenchIE Galician (test)	F1 Score: 18.3	5
Open Information Extraction	CaRB Spanish lexical match (test)	F1 Score: 64.2	4
Open Information Extraction	CaRB Portuguese lexical match (test)	F1 Score: 65.6	4
Open Information Extraction	CaRB Spanish-Clean lexical match (test)	F1 Score: 59.5	4
Open Information Extraction	BenchIE Arabic (test)	F1 Score: 0.075	3
