
LLMs can construct powerful representations and streamline sample-efficient supervised learning

About

As real-world datasets become more complex and heterogeneous, supervised learning is often bottlenecked by input representation design. Modeling multimodal data, such as time series, free text, and structured records, often requires non-trivial domain expertise. We propose an agentic pipeline to streamline this process. First, an LLM analyzes a small but diverse subset of text-serialized input examples in-context to synthesize a global rubric, which acts as a programmatic specification for extracting and organizing evidence. This rubric is then used to transform naive text-serializations of inputs into a more standardized format for downstream models. We also describe local rubrics, which are task-conditioned interpretive summaries generated by an LLM. Across 15 clinical tasks from the EHRSHOT benchmark, our rubric approaches significantly outperform count-feature models, naive LLM baselines, and a clinical foundation model pretrained on orders of magnitude more data. Beyond performance, rubrics offer operational advantages: they are easy to audit, cost-effective at scale, and they facilitate tabular representations.
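The two-stage global-rubric flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LLM` callable type, the prompt wording, the function names, the evenly spaced subset selection, and the `stub_llm` stand-in are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]


def synthesize_global_rubric(llm: LLM, task: str, examples: List[str]) -> str:
    """Stage 1: the LLM reads a few serialized inputs and writes one rubric."""
    joined = "\n---\n".join(examples)
    prompt = (
        f"Task: {task}\n"
        "Below are text-serialized input examples.\n"
        f"{joined}\n"
        "Write a rubric specifying which evidence to extract and how to organize it."
    )
    return llm(prompt)


def apply_rubric(llm: LLM, rubric: str, serialized_input: str) -> str:
    """Stage 2: rewrite one naive serialization into the rubric's format."""
    prompt = (
        f"Rubric:\n{rubric}\n\n"
        f"Input:\n{serialized_input}\n\n"
        "Rewrite the input following the rubric."
    )
    return llm(prompt)


def build_representations(
    llm: LLM, task: str, inputs: List[str], n_rubric_examples: int = 2
) -> Tuple[str, List[str]]:
    # Assumption: evenly spaced items stand in for a real
    # "small but diverse" subset selection.
    step = max(1, len(inputs) // n_rubric_examples)
    subset = inputs[::step][:n_rubric_examples]
    rubric = synthesize_global_rubric(llm, task, subset)
    return rubric, [apply_rubric(llm, rubric, x) for x in inputs]


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("Task:"):
        return "1. Labs  2. Diagnoses  3. Notes"
    body = prompt.split("Input:\n")[1].split("\n\n")[0]
    return f"STANDARDIZED<{body}>"


rubric, reps = build_representations(
    stub_llm, "ICU transfer", ["record a", "record b", "record c", "record d"]
)
```

In a real setting, `stub_llm` would be replaced by a call to an actual model, and the standardized outputs in `reps` would be passed to whatever downstream supervised model is being trained.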

Ilker Demirel, Lawrence Shi, Zeshan Hussain, David Sontag • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| Clinical prediction | EHRSHOT Overall 1.0 (test) | AUROC 77.2 | 20 |
| Clinical prediction | EHRSHOT Overall | AUPRC 45.9 | 20 |
| Clinical prediction | EHRSHOT Assignment of New Diag. | AUPRC 23.6 | 20 |
| Clinical prediction | EHRSHOT Anticipating Labs | AUPRC 77.3 | 20 |
| Lab Result Prediction | EHRSHOT Anticipating Labs | AUROC 0.799 | 20 |
| New Diagnosis Prediction | EHRSHOT Assignment of New Diagnoses | AUROC 0.77 | 20 |
| Operational Outcome Prediction | EHRSHOT Operational Outcomes | AUROC 80.2 | 20 |
| Clinical prediction | EHRSHOT Chest X-ray Findings | AUPRC 60.5 | 20 |
| Chest X-ray Finding Prediction | EHRSHOT Chest X-ray Findings | AUROC 0.608 | 20 |
| Multi-task clinical prediction suite (15 tasks) | EHRSHOT full-dataset | ICU Transfer AUROC 80 | 1 |
