ProUIE: A Macro-to-Micro Progressive Learning Method for LLM-based Universal Information Extraction

About

LLM-based universal information extraction (UIE) methods often rely on additional information beyond the original training data, which increases training complexity yet often yields limited gains. To address this, we propose ProUIE, a Macro-to-Micro progressive learning approach that improves UIE without introducing any external information. ProUIE consists of three stages: (i) macro-level Complete Modeling (CM), which learns NER, RE, and EE along their intrinsic difficulty order on the full training data to build a unified extraction foundation, (ii) meso-level Streamlined Alignment (SA), which operates on sampled data with simplified target formats, streamlining and regularizing structured outputs to make them more concise and controllable, and (iii) micro-level Deep Exploration (DE), which applies GRPO with stepwise fine-grained rewards (SFR) over structural units to guide exploration and improve performance. Experiments on 36 public datasets show that ProUIE consistently improves unified extraction, outperforming strong instruction-tuned baselines on average for NER and RE while using a smaller backbone, and it further demonstrates clear gains in large-scale production-oriented information extraction.
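The Deep Exploration stage pairs GRPO with stepwise fine-grained rewards over structural units. The sketch below shows one way such a reward could look for NER; it is not the authors' implementation. The unit format (a `(span, label)` pair), the `span_weight` partial-credit value, and the function names `unit_credit`, `stepwise_reward`, and `group_advantages` are all assumptions made for illustration. The group normalization follows the usual GRPO idea of scoring each sampled output relative to the other samples for the same prompt.

```python
from statistics import mean, pstdev
from typing import List, Tuple

# Hypothetical structural unit: (span text, type label).
Entity = Tuple[str, str]


def unit_credit(unit: Entity, others: set, span_weight: float) -> float:
    """Credit for one unit: 1.0 if exact match, span_weight if only the span matches."""
    if unit in others:
        return 1.0
    span = unit[0]
    if any(span == other_span for other_span, _ in others):
        return span_weight
    return 0.0


def stepwise_reward(pred: List[Entity], gold: List[Entity],
                    span_weight: float = 0.5) -> float:
    """Assumed fine-grained reward: harmonic mean of soft precision and soft recall."""
    pred_set, gold_set = set(pred), set(gold)
    if not pred_set and not gold_set:
        return 1.0  # nothing to extract and nothing extracted
    if not pred_set or not gold_set:
        return 0.0
    precision = sum(unit_credit(u, gold_set, span_weight) for u in pred_set) / len(pred_set)
    recall = sum(unit_credit(u, pred_set, span_weight) for u in gold_set) / len(gold_set)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def group_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Group-relative advantages: center and scale rewards within one sampled group.

    Uses the population standard deviation, which is an assumption here.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    gold = [("Paris", "LOC"), ("Marie Curie", "PER")]
    samples = [
        [("Paris", "LOC"), ("Marie Curie", "PER")],  # exact match
        [("Paris", "ORG"), ("Marie Curie", "PER")],  # one wrong label
        [("France", "LOC")],                         # no overlap
    ]
    rewards = [stepwise_reward(s, gold) for s in samples]
    print(rewards)
    print(group_advantages(rewards))
```

Compared with an all-or-nothing exact-match reward, partial credit of this kind lets a sample that finds the right span with the wrong label score above one that misses the entity entirely, which gives the policy a denser signal to explore against.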

Wenda Liu, Zhigang Song, Shuai Nie, Guangyao Liu, Lisung Chen, Binyu Yang, Yaran Chen, Peng Zhou, Hongzhen Wang, Yuchen Liu, Wenyue Hu, Jiaming Xu, Runyu Shi, Ying Huang • 2026

Related benchmarks

Task	Dataset	Result
Named Entity Recognition	CoNLL 03	--	135
Named Entity Recognition	OntoNotes	F1-score 88.33	121
Named Entity Recognition	BC5CDR	F1 Score 88.93	102
Relation Extraction	SciERC	Relation Strict F1 46.76	68
Relation Extraction	CoNLL 04	F1 73.17	59
Named Entity Recognition	multiNERD	Entity F1 95.91	50
Named Entity Recognition	tweetNER7	Entity F1 66.44	49
Named Entity Recognition	bc2gm	Entity F1 83.87	48
Named Entity Recognition	FabNER	Entity F1 80.49	45
Named Entity Recognition	Broad Twitter Corpus	Entity F1 0.7927	42

Showing 10 of 37 rows
