Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Learning Concepts, Not Tokens: Self-Supervised Semantic Alignment for Language Models

About

The next-token prediction (NTP) objective trains language models to predict a single token at each step, even though many continuations can express the same meaning. For example, in the sentence ``this sticker can be placed here'', positioned, attached, or put are all plausible alternatives. While standard NTP training treats these alternatives as mutually exclusive targets, we explore a self-supervised framework that encourages models to predict concepts, approximated as sets of semantically equivalent tokens. Models trained with this concept supervision align better with human similarity judgments, improve classification, clustering, and reranking performance, and achieve comparable or stronger downstream reasoning. These gains come with lower perplexity on semantically meaningful words (Section 3.2) and only minimal increases in global perplexity, suggesting that concepts enhance semantic alignment while preserving language modeling quality. Our code is available at https://anonymous.4open.science/r/learning-concepts-9025 .

Christine Zhang, Dan Jurafsky, Chen Shani• 2026

Related benchmarks

TaskDatasetResultRank
Semantic Textual SimilaritySTS-B
Spearman's Rho (x100)65.05
156
Word SimilarityWordSim-353
Spearman Rho0.3071
114
Word SimilarityMEN
Spearman Rho0.4362
74
Semantic SimilaritySimLex
Spearman Correlation0.1844
60
ClusteringOpenWebText
Clustering Score0.6222
30
ClusteringC4
Clustering Score63.95
30
Next-token predictionOpenWebText (held-out)
ID PPL18.53
30
Next-token predictionC4
OOD Perplexity21.1
30
Next-token predictionC4 (held-out)
Perplexity (PPL)21.5
30
Next-token predictionOpenWebText
PPL18.68
30
Showing 10 of 10 rows

Other info

Follow for update