Improving LLM Predictions via Inter-Layer Structural Encoders
About
The standard practice in Large Language Models (LLMs) is to base predictions on the final-layer token representations. Recent studies, however, show that intermediate layers encode substantial information and may contain more task-relevant features than the final-layer representations alone. Importantly, the optimal layer has been shown to differ from task to task. In this work we introduce Inter-Layer Structural Encoders (ILSE), a structural approach that learns a single effective representation from all of the LLM's internal layer representations jointly. Central to ILSE is Cayley-Encoder, a mathematically grounded geometric encoder that leverages expander Cayley graphs for efficient inter-layer information propagation. We evaluate ILSE across 13 classification and semantic similarity tasks with 9 pre-trained LLMs ranging from 14 million to 8 billion parameters. ILSE consistently outperforms baselines and existing approaches, achieving up to 44% improvement in accuracy and 25% in similarity metrics. We further show that ILSE is data-efficient in few-shot regimes and can make small LLMs competitive with substantially larger models.
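To make the idea of propagating information between layers over an expander Cayley graph more concrete, here is a minimal, untrained sketch. It is not the authors' Cayley-Encoder. It assumes the Cayley graph of SL(2, Z_n) with the two standard elementary generators and their inverses (a common expander construction, which the abstract does not confirm), places one pooled vector per LLM layer on a graph node, and uses plain mean aggregation in place of learned message passing. The function names `cayley_graph_sl2` and `propagate` are hypothetical.

```python
from itertools import product


def cayley_graph_sl2(n):
    """Adjacency lists of the Cayley graph of SL(2, Z_n) (assumed construction)."""
    # Nodes: 2x2 matrices (a, b, c, d) over Z_n with determinant 1 mod n.
    nodes = [m for m in product(range(n), repeat=4)
             if (m[0] * m[3] - m[1] * m[2]) % n == 1 % n]
    index = {m: i for i, m in enumerate(nodes)}
    # Generators [[1,1],[0,1]] and [[1,0],[1,1]] together with their inverses,
    # so the resulting graph is undirected.
    gens = [(1, 1, 0, 1), (1, n - 1, 0, 1), (1, 0, 1, 1), (1, 0, n - 1, 1)]

    def mul(x, y):
        a, b, c, d = x
        e, f, g, h = y
        return ((a * e + b * g) % n, (a * f + b * h) % n,
                (c * e + d * g) % n, (c * f + d * h) % n)

    return [sorted({index[mul(m, g)] for g in gens}) for m in nodes]


def propagate(layer_vecs, adj, steps=2):
    """Mix per-layer vectors over the graph and return one pooled vector."""
    if len(layer_vecs) > len(adj):
        raise ValueError("graph has fewer nodes than there are layers")
    dim = len(layer_vecs[0])
    # Assumption: layer i sits on node i; any remaining nodes start at zero.
    h = [list(v) for v in layer_vecs]
    h += [[0.0] * dim for _ in range(len(adj) - len(layer_vecs))]
    for _ in range(steps):
        # Mean over each node and its neighbours (stand-in for a learned update).
        h = [[(h[i][k] + sum(h[j][k] for j in adj[i])) / (1 + len(adj[i]))
              for k in range(dim)]
             for i in range(len(adj))]
    # Read out by averaging the nodes that hold layer representations.
    count = len(layer_vecs)
    return [sum(h[i][k] for i in range(count)) / count for k in range(dim)]
```

With n = 3 the graph has 24 nodes, each with 4 neighbours, so it can host up to 24 layer representations. A trained encoder would replace the fixed averaging with learned weights and feed the pooled vector to a task head.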
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Intent Classification | Banking77 (test) | Accuracy | 92.85 | 184 |
| Semantic Textual Similarity | STS Benchmark (test) | Pearson Correlation (r) | 0.6305 | 46 |
| Semantic Textual Similarity | STS14 (test) | Spearman Correlation | 0.7017 | 42 |
| Semantic Textual Similarity | STS15 (test) | Spearman Correlation | 0.7696 | 42 |
| Semantic Textual Similarity | STS13 (test) | Spearman Correlation | 63.98 | 42 |
| Semantic Textual Similarity | STS16 (test) | Spearman Correlation | 63.13 | 42 |
| Text Classification | Emotion (test) | Accuracy | 79.9 | 38 |
| Classification | MTOP Domain (test) | Accuracy | 99.16 | 33 |
| Classification | MTOP Intent (test) | Accuracy | 96.46 | 33 |
| Classification | PoemSentiment (test) | Accuracy | 83.27 | 33 |