
LiFT: Does Instruction Fine-Tuning Improve In-Context Learning for Longitudinal Modelling by Large Language Models?

About

Longitudinal NLP tasks require reasoning over temporally ordered text to detect persistence and change in human behavior and opinions. However, in-context learning with large language models struggles on tasks where models must integrate historical context, track evolving interactions, and handle rare change events. We introduce LiFT, a longitudinal instruction fine-tuning framework that unifies diverse longitudinal modeling tasks under a shared instruction schema. LiFT uses a curriculum that progressively increases temporal difficulty while incorporating few-shot structure and temporal conditioning to encourage effective use of past context. We evaluate LiFT across five datasets. Models trained on longitudinal tasks with different levels of temporal granularity are tested for generalisability on two separate datasets. Across models with different parameter sizes (OLMo (1B/7B), LLaMA-8B, and Qwen-14B), LiFT consistently outperforms base-model ICL, with strong gains on out-of-distribution data and minority change events.
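The abstract describes a shared instruction schema with few-shot structure and a curriculum over temporal difficulty, but this page gives no concrete format. The following is a minimal sketch of what such a setup could look like, not the authors' implementation. Every name here (`Example`, `INSTRUCTION`, `render`, `build_prompt`, `curriculum_order`) is hypothetical, the instruction wording and the `[t=i]` markers are invented for illustration, and using history length as a proxy for temporal difficulty is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Example:
    history: list[str]  # earlier posts or utterances, oldest first
    target: str         # the entry to be labelled
    label: str          # gold label (only shown for few-shot examples)


# Hypothetical instruction wording; the paper's actual schema is not given here.
INSTRUCTION = "Given the timeline so far, label the final entry."


def render(example: Example, with_label: bool) -> str:
    """Render one timeline with explicit time-step markers."""
    lines = [f"[t={i}] {text}" for i, text in enumerate(example.history)]
    lines.append(f"[t={len(example.history)}] {example.target}")
    lines.append(f"Label: {example.label}" if with_label else "Label:")
    return "\n".join(lines)


def build_prompt(query: Example, shots=()) -> str:
    """Instruction, then labelled few-shot timelines, then the unlabelled query."""
    blocks = [INSTRUCTION]
    blocks += [render(shot, with_label=True) for shot in shots]
    blocks.append(render(query, with_label=False))
    return "\n\n".join(blocks)


def curriculum_order(examples):
    """Order training examples from short to long history (assumed difficulty proxy)."""
    return sorted(examples, key=lambda ex: len(ex.history))
```

Under these assumptions, a training loop would iterate over `curriculum_order(train_examples)` and call `build_prompt` on each example, so that the model first sees timelines with little history and later ones that require integrating more past context.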

Iqra Ali, Talia Tseriotou, Mahmud Elahi Akhter, Yuxiang Zhou, Maria Liakata • 2026

Related benchmarks

Task                         Dataset           Metric    Result  Rank
Longitudinal Classification  AnnoMI (80/20)    Macro F1  52.6    24
Longitudinal Classification  LRS (80/20)       Macro F1  57.8    24
Longitudinal Classification  TalkLife (80/20)  Macro F1  34.6    24
Longitudinal Classification  Reddit (test)     Macro F1  52.1    24
Longitudinal Classification  CMV (test)        Macro F1  57.7    24
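All results above are Macro F1, which averages per-class F1 without weighting by class frequency, so rare classes such as the minority change events mentioned in the abstract count as much as frequent ones. The sketch below is a plain-Python illustration of that standard definition, not the evaluation code used for these benchmarks. The label names `change` and `no_change` are hypothetical, and the table values appear to be reported on a 0-100 scale, whereas this function returns a value in [0, 1].

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# A classifier that never predicts the rare class: accuracy is 0.8,
# but the rare class contributes an F1 of 0 to the macro average.
y_true = ["no_change"] * 8 + ["change"] * 2
y_pred = ["no_change"] * 10
score = macro_f1(y_true, y_pred)
```

In this example the majority class scores 16/18 and the minority class scores 0, so the macro average is 4/9 even though 80% of predictions are correct. This is why Macro F1 is a common choice when change events are rare.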
