Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
About
Context faithfulness is essential for reliable reasoning in context-dependent scenarios. However, large language models often struggle to ground their outputs in the provided context, resulting in irrelevant responses. Inspired by the emergent expert specialization observed in mixture-of-experts architectures, this work investigates whether certain experts exhibit specialization in context utilization, offering a potential pathway toward targeted optimization for improved context faithfulness. To explore this, we propose Router Lens, a method that accurately identifies context-faithful experts. Our analysis reveals that these experts progressively amplify attention to relevant contextual information, thereby enhancing context grounding. Building on this insight, we introduce Context-faithful Expert Fine-Tuning (CEFT), a lightweight optimization approach that selectively fine-tunes context-faithful experts. Experiments across a wide range of benchmarks and models demonstrate that CEFT matches or surpasses the performance of full fine-tuning while being significantly more efficient.
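The selective fine-tuning idea behind CEFT can be illustrated with a minimal sketch. It assumes the context-faithful experts have already been identified (for example by Router Lens) and are given as `(layer, expert)` pairs. The `Param` class, the `select_trainable` helper, and the `layers.{l}.experts.{e}.` parameter naming scheme are all hypothetical stand-ins, not the authors' implementation; in a real framework the same pattern would toggle each tensor's gradient flag.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a named model parameter."""
    name: str
    requires_grad: bool = True


def select_trainable(params, faithful_experts):
    """Freeze every parameter except those inside the chosen experts.

    faithful_experts is a set of (layer, expert) index pairs, assumed to
    come from a prior identification step.
    """
    prefixes = tuple(
        f"layers.{layer}.experts.{expert}." for layer, expert in faithful_experts
    )
    for p in params:
        p.requires_grad = p.name.startswith(prefixes)
    return [p.name for p in params if p.requires_grad]


# Toy model (assumed layout): 2 layers, each with attention, a router, and 4 experts.
params = [Param(f"layers.{l}.attn.weight") for l in range(2)]
params += [Param(f"layers.{l}.router.weight") for l in range(2)]
params += [
    Param(f"layers.{l}.experts.{e}.ffn.weight") for l in range(2) for e in range(4)
]

# Illustrative expert choice only; not taken from the paper.
faithful = {(0, 1), (1, 3)}
trainable = select_trainable(params, faithful)
trainable_fraction = len(trainable) / len(params)
```

In this toy setup only 2 of the 12 parameter groups remain trainable, which is the sense in which such an approach can be lighter than full fine-tuning.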
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Object Hallucination Evaluation | POPE | Accuracy 87.1 | 2019 |
| Diagram Understanding | AI2D | Accuracy 66.4 | 317 |
| Multimodal Understanding | MMMU (val) | -- | 199 |
| Multi-modal Evaluation | MME | MME Score 1.51e+3 | 160 |
| Multimodal Benchmarking | MMBench | Accuracy 74.9 | 90 |
| Multi-modal Reasoning | MMVet | Score 43.5 | 18 |
| Image Understanding | SEED-IMG | Accuracy 71.8 | 17 |