MultiMUC: Multilingual Template Filling on MUC-4

About

We introduce MultiMUC, the first multilingual parallel corpus for template filling, comprising translations of the classic MUC-4 template filling benchmark into five languages: Arabic, Chinese, Farsi, Korean, and Russian. We obtain automatic translations from a strong multilingual machine translation system and manually project the original English annotations into each target language. For all languages, we also provide human translations for sentences in the dev and test splits that contain annotated template arguments. Finally, we present baselines on MultiMUC both with state-of-the-art template filling models and with ChatGPT.

William Gantt, Shabnam Behzad, Hannah YoungEun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme, Mahsa Yarmohammadi • 2024

Related benchmarks

Task	Dataset	Metric	Result	Rank
Document-level Information Extraction	MUC	F1 Score	22.41	9
Document-level Information Extraction	MultiMUC (averaged across languages)	F1 Score	12.93	9

