LLMs can construct powerful representations and streamline sample-efficient supervised learning

About

As real-world datasets become more complex and heterogeneous, supervised learning is often bottlenecked by input representation design. Modeling multimodal data, such as time series, free text, and structured records, often requires non-trivial domain expertise. We propose an agentic pipeline to streamline this process. First, an LLM analyzes a small but diverse subset of text-serialized input examples in-context to synthesize a global rubric, which acts as a programmatic specification for extracting and organizing evidence. This rubric is then used to transform naive text serializations of inputs into a more standardized format for downstream models. We also describe local rubrics, which are task-conditioned interpretive summaries generated by an LLM. Across 15 clinical tasks from the EHRSHOT benchmark, our rubric approaches significantly outperform count-feature models, naive LLM baselines, and a clinical foundation model pretrained on orders of magnitude more data. Beyond performance, rubrics offer operational advantages: they are easy to audit, are cost-effective at scale, and facilitate tabular representations.

Ilker Demirel, Lawrence Shi, Zeshan Hussain, David Sontag • 2026
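
The global-rubric step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable interface, the prompt wording, the label-alternating subset picker, and all function names are hypothetical, and the abstract does not say how the "small but diverse" subset is actually chosen.

```python
from __future__ import annotations

from typing import Callable, Sequence

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]
Example = tuple[str, int]  # (text-serialized input, label)


def pick_subset(examples: Sequence[Example], k: int) -> list[Example]:
    """Assumed stand-in for 'small but diverse': alternate between labels."""
    by_label: dict[int, list[Example]] = {}
    for text, label in examples:
        by_label.setdefault(label, []).append((text, label))
    groups = list(by_label.values())
    chosen: list[Example] = []
    i = 0
    # Round-robin over label groups until k examples are taken or none remain.
    while len(chosen) < k and any(groups):
        group = groups[i % len(groups)]
        if group:
            chosen.append(group.pop(0))
        i += 1
    return chosen


def synthesize_global_rubric(llm: LLM, task: str, subset: Sequence[Example]) -> str:
    """Ask the LLM for one rubric, given labeled examples in-context."""
    shots = "\n\n".join(f"[label={y}]\n{x}" for x, y in subset)
    prompt = (
        f"Task: {task}\n\nExamples:\n{shots}\n\n"
        "Write a rubric specifying what evidence to extract and how to organize it."
    )
    return llm(prompt)


def apply_global_rubric(llm: LLM, rubric: str, serialized: str) -> str:
    """Rewrite one naive serialization into the rubric's standardized format."""
    prompt = (
        f"Rubric:\n{rubric}\n\nRecord:\n{serialized}\n\n"
        "Rewrite the record following the rubric."
    )
    return llm(prompt)


if __name__ == "__main__":
    # Offline stub so the sketch runs without any model; replace with a real client.
    def stub_llm(prompt: str) -> str:
        return f"<completion for a {len(prompt)}-character prompt>"

    data = [("visit A ...", 0), ("visit B ...", 0), ("visit C ...", 1)]
    rubric = synthesize_global_rubric(stub_llm, "hypothetical task", pick_subset(data, 2))
    print(apply_global_rubric(stub_llm, rubric, "visit D ..."))
```

The point of the split is that the rubric is synthesized once per task from a handful of examples and then reused for every input, so the standardized outputs can be passed to any downstream supervised model.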

Related benchmarks

Task	Dataset	Result
Clinical prediction	EHRSHOT Overall 1.0 (test)	AUROC 77.2	20
Clinical prediction	EHRSHOT Overall	AUPRC 45.9	20
Clinical prediction	EHRSHOT Assignment of New Diag.	AUPRC 23.6	20
Clinical prediction	EHRSHOT Anticipating Labs	AUPRC 77.3	20
Lab Result Prediction	EHRSHOT Anticipating Labs	AUROC 0.799	20
New Diagnosis Prediction	EHRSHOT Assignment of New Diagnoses	AUROC 0.77	20
Operational Outcome Prediction	EHRSHOT Operational Outcomes	AUROC 80.2	20
Clinical prediction	EHRSHOT Chest X-ray Findings	AUPRC 60.5	20
Chest X-ray Finding Prediction	EHRSHOT Chest X-ray Findings	AUROC 0.608	20
Multi-task clinical prediction suite (15 tasks)	EHRSHOT full-dataset	ICU Transfer AUROC 80	1

