
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning

About

Supervised Fine-Tuning (SFT) is used to specialize model behavior by training weights to produce intended target responses for queries. In contrast, In-Context Learning (ICL) adapts models during inference with instructions or demonstrations in the prompt. ICL can offer better generalizability and more calibrated responses than SFT in data-scarce settings, at the cost of more inference compute. In this work, we ask: Can ICL's internal computations be used to improve the quality of SFT? We first show that ICL and SFT produce distinct activation patterns, indicating that the two methods achieve adaptation through different functional mechanisms. Motivated by this observation, and to harness ICL's richer functional behavior, we introduce ICL Activation Alignment (IA2), a self-distillation technique that replicates ICL's activation patterns in SFT models, incentivizing ICL-like internal reasoning. Performing IA2 as a priming step before SFT significantly improves the accuracy and calibration of model outputs, as shown by our extensive empirical results on 12 popular benchmarks and two model families. This finding is not only practically useful but also offers a conceptual window into the inner mechanics of model adaptation.
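The abstract does not specify the exact IA2 objective, but the core idea — aligning an SFT model's activations with those produced under ICL — can be sketched as a per-layer distance between two sets of hidden states. Below is a minimal, hypothetical illustration: `student_acts` (activations from the plain query) and `teacher_acts` (activations from the query with in-context demonstrations, frozen teacher) are assumed names, and the mean-squared-error form and `alpha` weighting are assumptions, not the paper's actual loss.

```python
import numpy as np

def ia2_alignment_loss(student_acts, teacher_acts, alpha=1.0):
    """Hypothetical IA2-style priming loss.

    student_acts / teacher_acts: lists of per-layer activation vectors
    (student = SFT model on the plain query; teacher = frozen model on
    the query plus in-context demonstrations). Returns the mean-squared
    distance averaged over layers, scaled by an assumed weight `alpha`.
    """
    per_layer = [np.mean((s - t) ** 2) for s, t in zip(student_acts, teacher_acts)]
    return alpha * float(np.mean(per_layer))

# Toy example: two "layers" of 4-dimensional activations.
student = [np.zeros(4), np.ones(4)]
teacher = [np.ones(4), np.ones(4)]
print(ia2_alignment_loss(student, teacher))  # → 0.5
```

In practice this term would be minimized as a priming step before the standard SFT cross-entropy objective, which is the ordering the abstract describes.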

Aayush Mishra, Daniel Khashabi, Anqi Liu · 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Problem-Solving | GSM8K | Exact Match Accuracy | 77.4 | 20 |
| Sentiment Classification | SST-2 (val) | Accuracy | 90.4 | 14 |
| Financial Sentiment Analysis | FinS (val) | Accuracy | 82.4 | 8 |
| Poem Sentiment Analysis | PoemS (val) | Accuracy | 68.4 | 8 |
| Multi-token generation | GSM8K | Accuracy | 68.8 | 5 |
| Multi-token generation | SciQ | Accuracy | 40.8 | 5 |
| Multi-token generation | HMathA | Accuracy | 55.3 | 5 |
| News Classification | AGN (val) | Accuracy | 31.8 | 4 |
| Science Question Answering | QASCr (val) | Accuracy | 79.4 | 4 |
| Science Question Answering | SciQr (val) | Accuracy | 91.7 | 4 |
Showing 10 of the 12 benchmark rows.
