NITP: Next Implicit Token Prediction for LLM Pre-training

About

Standard next-token prediction (NTP) supervises language models solely through discrete labels in the output logit space. We argue that this sparse one-hot supervision leaves the latent representation space under-constrained, allowing hidden states to drift into degenerate and anisotropic configurations that can limit generalization. To address this issue, we propose Next Implicit Token Prediction (NITP), which augments discrete prediction with dense continuous supervision directly in the representation space. NITP trains the model to predict the implicit semantic content of the next token, using shallow-layer representations from the same model as stable self-supervised targets. We provide theoretical analysis showing that NITP regularizes the optimization landscape by mitigating under-constrained degrees of freedom and encouraging a compact, structured representation geometry. Empirically, across dense and MoE models ranging from 0.5B to 9B parameters, NITP consistently improves downstream performance with negligible computational overhead. On a 9B MoE model, NITP achieves a 5.7% absolute improvement on MMLU-Pro, along with gains of 6.4% on C3 and 4.3% on CommonsenseQA, with approximately 2% additional training FLOPs and no additional inference cost. Our implementation is available at https://github.com/aHapBean/NITP.
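The combined objective described in the abstract can be illustrated with a minimal single-position sketch. The abstract only states that NITP adds dense continuous supervision in representation space, with shallow-layer representations from the same model as targets. The cosine-distance form of the auxiliary term, the `weight` coefficient, and all function names below are assumptions made for illustration, not the authors' implementation (see the linked repository for that).

```python
import math

def cross_entropy(logits, target_index):
    """Standard NTP loss at one position: -log softmax(logits)[target_index]."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target_index]

def cosine_distance(u, v, eps=1e-8):
    """1 - cosine similarity; an assumed choice of representation-space distance."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + eps)

def nitp_style_loss(logits, next_token_id, predicted_repr, shallow_target_repr, weight=0.1):
    """Hypothetical combined loss: discrete NTP term plus a continuous term.

    predicted_repr      -- the model's prediction of the next token's representation
    shallow_target_repr -- the next token's shallow-layer representation, treated
                           here as a fixed target (assumption: no gradient flows
                           through it in a real training setup)
    weight              -- assumed mixing coefficient, not taken from the paper
    """
    discrete_term = cross_entropy(logits, next_token_id)
    implicit_term = cosine_distance(predicted_repr, shallow_target_repr)
    return discrete_term + weight * implicit_term

# Usage: a 3-token vocabulary and 2-dimensional representations.
loss = nitp_style_loss(
    logits=[2.0, 0.5, -1.0],
    next_token_id=0,
    predicted_repr=[0.9, 0.1],
    shallow_target_repr=[1.0, 0.0],
)
print(loss)
```

When the predicted representation matches the target, the auxiliary term vanishes and the loss reduces to ordinary cross-entropy, which is the sense in which the continuous term adds supervision rather than replacing the discrete one.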

Xiangdong Zhang, Debing Zhang, Shaofeng Zhang, Xiaohan Qin, Yu Cheng, Junchi Yan • 2026

Related benchmarks

Task	Dataset	Result
Reasoning	BBH	Accuracy 29.4	770
Scientific Reasoning	ARC Challenge	Accuracy 53.95	121
Language Modeling	LAMBADA	Accuracy 64.49	114
Reading Comprehension	C3	Accuracy 63.67	89
Chinese Language Understanding	C-Eval	Accuracy 40.72	68
Language Understanding	CEval	Accuracy 40.14	67
Natural Language Understanding	AGIEval	Accuracy 35.11	46
Language Understanding	MMLU	Accuracy 44.95	43
World Knowledge	MMLU	Accuracy 46.14	39
Code Generation	LCB	Accuracy 8	29

Showing 10 of 19 rows
