Improving LLM Final Representations with Inter-Layer Geometry

About

The standard in LLM-based prediction is to use the final-layer representation as the input to a downstream predictor. However, intermediate layers may encode complementary task-relevant signals. Existing approaches therefore either search for the best layer for each task or apply expensive attention-based mechanisms to learn inter-layer aggregation. In this work, we first show that such complexity is unnecessary: a lightweight Graph Neural Network over a fully connected graph of LLM layers is more efficient and achieves significantly stronger predictive performance than existing approaches. We then introduce the Cayley-Encoder, which further improves both efficiency and predictive performance by replacing the fully connected graph with a Cayley graph over SL(2, Z_n). These Cayley graphs provide a mathematically grounded topology that is sparse, regular by construction, and has low diameter. This enables effective communication across layers while constraining the aggregation structure and reducing the risk of GNN overfitting. In an evaluation across 13 tasks and 9 LLMs, Cayley-Encoder consistently outperforms baselines, achieving improvements of up to 40 percentage points in accuracy, while introducing at most 0.1% additional parameters relative to the LLM size. We further show that Cayley-Encoder is effective in few-shot regimes. Finally, we show that Cayley-Encoder outperforms LoRA fine-tuning while operating on the frozen LLM. We conclude with an explainability analysis showing that multiple layers contribute meaningfully to the final prediction, supporting our hypothesis.
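
To make the graph topology mentioned in the abstract concrete, here is a minimal Python sketch that builds a Cayley graph over SL(2, Z_n). It rests on assumptions not stated above: the generating set is taken to be the two elementary matrices [[1, 1], [0, 1]] and [[1, 0], [1, 1]] together with their inverses (a common choice for these graphs), and the function name `cayley_graph_sl2` is hypothetical. How the paper maps LLM layers onto graph nodes is not described in the abstract, so the sketch covers only the graph itself.

```python
from collections import deque


def cayley_graph_sl2(n):
    """Build a Cayley graph of SL(2, Z_n) by breadth-first search.

    Assumption: generators are [[1,1],[0,1]], [[1,0],[1,1]] and their
    inverses mod n. Matrices are stored as tuples (a, b, c, d) for
    [[a, b], [c, d]]. Use n >= 3 so the four generators are distinct.
    """
    gens = [(1, 1, 0, 1), (1, n - 1, 0, 1), (1, 0, 1, 1), (1, 0, n - 1, 1)]

    def mul(m, g):
        # 2x2 matrix product m @ g with entries reduced mod n.
        a, b, c, d = m
        e, f, h, k = g
        return ((a * e + b * h) % n, (a * f + b * k) % n,
                (c * e + d * h) % n, (c * f + d * k) % n)

    identity = (1, 0, 0, 1)
    index = {identity: 0}   # group element -> node id
    depth = {identity: 0}   # BFS distance from the identity
    queue = deque([identity])
    edges = set()           # directed edges (u, v); the set is symmetric
    while queue:
        m = queue.popleft()
        for g in gens:
            nb = mul(m, g)
            if nb not in index:
                index[nb] = len(index)
                depth[nb] = depth[m] + 1
                queue.append(nb)
            edges.add((index[m], index[nb]))
    # Cayley graphs are vertex-transitive, so the eccentricity of the
    # identity equals the diameter of the whole graph.
    return index, edges, max(depth.values())


index, edges, diameter = cayley_graph_sl2(3)
print(len(index), len(edges), diameter)
```

For n = 3 the group has 24 elements, so the graph has 24 nodes with four neighbours each, compared with 23 neighbours per node in a fully connected graph of the same size. This illustrates the sparsity and regularity that the abstract refers to.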

Tom Ulanovski, Eyal Blyachman, Maya Bechler-Speicher • 2026

Related benchmarks

Task	Dataset	Result
Intent Classification	Banking77 (test)	Accuracy 92.85	196
Semantic Textual Similarity	STS Benchmark (test)	Pearson Correlation (r) 0.6305	46
Semantic Textual Similarity	STS14 (test)	Spearman Correlation 0.7017	42
Semantic Textual Similarity	STS15 (test)	Spearman Correlation 0.7696	42
Semantic Textual Similarity	STS13 (test)	Spearman Correlation 63.98	42
Semantic Textual Similarity	STS16 (test)	Spearman Correlation 63.13	42
Text Classification	Emotion (test)	Accuracy 79.9	38
Classification	MTOP Domain (test)	Accuracy 99.16	33
Classification	MTOPIntent (test)	Accuracy 96.46	33
Classification	PoemSentiment (test)	Accuracy 83.27	33

Showing 10 of 13 rows
