
Using a Human-AI Teaming Approach to Create and Curate Scientific Datasets with the SCILIRE System

About

The rapid growth of scientific literature has made manual extraction of structured knowledge increasingly impractical. To address this challenge, we introduce SCILIRE, a system for creating datasets from scientific literature. SCILIRE has been designed around Human-AI teaming principles centred on workflows for verifying and curating data. It facilitates an iterative workflow in which researchers can review and correct AI outputs. Furthermore, this interaction is used as a feedback signal to improve future LLM-based inference. We evaluate our design using a combination of intrinsic benchmarking outcomes together with real-world case studies across multiple domains. The results demonstrate that SCILIRE improves extraction fidelity and facilitates efficient dataset creation.

Necva Bölücü, Jessica Irons, Changhyun Lee, Brian Jin, Maciej Rybinski, Huichen Yang, Andreas Duenser, Stephen Wan • 2026

Related benchmarks

Task                    Dataset          Result                Rank
Information Extraction  TDMS             Precision 18.12       10
Information Extraction  SciREX           Precision 14.52       10
Information Extraction  PolyIE           Precision 17.24       10
Information Extraction  MPEA             Precision (P) 39.59   10
Information Extraction  Diffusion        Precision 27.51       6
Information Extraction  MMD              Precision 36.18       6
Information Extraction  MRL              Precision 4.9         6
Information Extraction  PNCExtract       Precision 51.19       6
Information Extraction  BRENDA enzyme    Precision 65.2        6
Information Extraction  BRENDA ribozyme  Precision 30.75       6
Showing 10 of 50 rows
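
For readers unfamiliar with the metric reported above, precision is the share of extracted items that are correct: true positives divided by all predictions. The sketch below shows the standard set-based calculation. It is not taken from SCILIRE. The function name, the tuple layout, and the sample records are hypothetical, and it assumes the table reports precision as a percentage and that items are matched exactly, neither of which the listing states.

```python
from typing import Hashable, Iterable


def precision(predicted: Iterable[Hashable], gold: Iterable[Hashable]) -> float:
    """Return the fraction of distinct predicted items that appear in gold.

    Assumption: items are compared by exact equality; a benchmark may
    use a looser matching rule.
    """
    predicted_set = set(predicted)
    if not predicted_set:
        # No predictions were made; report 0.0 by convention here.
        return 0.0
    true_positives = len(predicted_set & set(gold))
    return true_positives / len(predicted_set)


# Hypothetical (entity, property, value) records, for illustration only.
gold_records = [
    ("alloy-A", "hardness", "520 HV"),
    ("alloy-B", "hardness", "610 HV"),
    ("alloy-C", "density", "7.9 g/cm3"),
]
predicted_records = [
    ("alloy-A", "hardness", "520 HV"),    # correct
    ("alloy-B", "hardness", "610 HV"),    # correct
    ("alloy-C", "density", "7.9 g/cm3"),  # correct
    ("alloy-D", "density", "8.1 g/cm3"),  # not in gold
]

score = precision(predicted_records, gold_records)
print(f"Precision: {100 * score:.2f}")  # 3 of 4 predictions correct
```

Under this definition a low score means many extracted records were wrong, but it says nothing about how many gold records were missed; that is what recall measures.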
