LLMs can construct powerful representations and streamline sample-efficient supervised learning
About
As real-world datasets become increasingly complex and heterogeneous, supervised learning is often bottlenecked by input representation design. Modeling multimodal data, such as time series, free text, and structured records, for downstream tasks often requires non-trivial domain-specific engineering. We propose an agentic pipeline to streamline this process. First, an LLM analyzes a small but diverse subset of text-serialized input examples in-context to synthesize a global rubric, which acts as a programmatic specification for extracting and organizing evidence. This rubric is then used to transform naive text serializations of inputs into a more standardized format for downstream models. We also describe local rubrics, which are task-conditioned summaries generated by an LLM. Across 15 clinical tasks from the EHRSHOT benchmark, our rubric-based approaches significantly outperform traditional count-feature models, naive text-serialization-based LLM baselines, and a clinical foundation model pretrained on orders of magnitude more data. Beyond performance, rubrics offer several advantages for operational healthcare settings: they are easy to audit, cost-effective to deploy at scale, and convertible to tabular representations that unlock a swath of machine learning techniques.
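The two-stage flow described in the abstract (synthesize a global rubric from a few serialized examples, then apply it to every input to get a standardized, tabular-friendly record) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable, the function names, the prompt wording, the choice to represent a rubric as a list of field names, and the use of random sampling as a stand-in for "small but diverse" selection are all assumptions. `toy_llm` is a hypothetical deterministic stub so the sketch runs without any model, and its field names are invented for illustration.

```python
import random
from typing import Callable, Dict, List

# Hypothetical interface: any function mapping a prompt string to a text reply.
LLM = Callable[[str], str]


def synthesize_global_rubric(
    llm: LLM, examples: List[str], task: str, k: int = 3, seed: int = 0
) -> List[str]:
    """Ask the LLM for a rubric from a small in-context subset of examples.

    Assumption: the rubric is returned as one field name per line.
    Random sampling stands in for a real diversity-aware selection step.
    """
    rng = random.Random(seed)
    subset = rng.sample(examples, min(k, len(examples)))
    prompt = (
        f"Task: {task}\n"
        + "\n---\n".join(subset)
        + "\nWrite a rubric: one field name per line."
    )
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]


def apply_rubric(llm: LLM, rubric: List[str], serialized: str) -> Dict[str, str]:
    """Turn one naive text serialization into a row keyed by rubric fields."""
    row: Dict[str, str] = {}
    for field in rubric:
        prompt = f"Field: {field}\nRecord: {serialized}\nAnswer briefly."
        row[field] = llm(prompt).strip()
    return row


def toy_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model (illustrative only)."""
    if "Write a rubric" in prompt:
        return "latest_creatinine\nactive_diagnoses\n"
    field = prompt.splitlines()[0][len("Field: "):]
    return f"<{field} extracted>"
```

Applying `apply_rubric` to every record yields a list of dictionaries with identical keys, which can be loaded into a table and passed to any standard tabular learner. Under these assumptions, the rubric itself is plain text that can be read and audited before anything is deployed.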
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Clinical prediction | EHRSHOT Overall 1.0 (test) | AUROC | 77.2 | 20 |
| Clinical prediction | EHRSHOT Overall | AUPRC | 45.9 | 20 |
| Clinical prediction | EHRSHOT Assignment of New Diag. | AUPRC | 23.6 | 20 |
| Clinical prediction | EHRSHOT Anticipating Labs | AUPRC | 77.3 | 20 |
| Lab Result Prediction | EHRSHOT Anticipating Labs | AUROC | 0.799 | 20 |
| New Diagnosis Prediction | EHRSHOT Assignment of New Diagnoses | AUROC | 0.77 | 20 |
| Operational Outcome Prediction | EHRSHOT Operational Outcomes | AUROC | 80.2 | 20 |
| Clinical prediction | EHRSHOT Chest X-ray Findings | AUPRC | 60.5 | 20 |
| Chest X-ray Finding Prediction | EHRSHOT Chest X-ray Findings | AUROC | 0.608 | 20 |
| Multi-task clinical prediction suite (15 tasks) | EHRSHOT full-dataset | ICU Transfer AUROC | 80 | 1 |