
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates

About

Recent findings reveal that much of the knowledge in a Transformer-based large language model (LLM) is encoded in its feed-forward network (FFN) layers. Each FFN layer can be interpreted as a summation of sub-updates, each corresponding to a weighted column vector of the FFN's value parameter matrix that often encodes a human-interpretable concept. In light of this, we hypothesize that model performance and behavior can be further enhanced and controlled by modulating the contributions of these sub-updates based on their relevance to the input or the target output style. We propose LLMBRACES, a novel and efficient method that computes relevance scores for the value vectors in FFN layers and uses these scores to dynamically adjust each sub-update's contribution. By optimizing sub-update contributions, LLMBRACES refines the prediction process, leading to more accurate and reliable outputs, much like a 'brace' providing support and stability. Moreover, LLMBRACES can be extended to support conditional control over generation characteristics, such as sentiment, thereby offering fine-grained steering of LLM outputs. Extensive experiments on various LLMs, including Qwen2.5-1.5B, Llama2-7B, and Llama3-8B, demonstrate that LLMBRACES outperforms baseline approaches in both fine-tuning and zero-shot settings while requiring significantly fewer tunable parameters: up to 75% fewer than LoRA. Furthermore, LLMBRACES excels at sentiment-controlled generation and toxicity reduction, highlighting its potential for flexible, controlled text generation across applications.
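The sub-update view described above can be sketched in a few lines. The following is a minimal toy illustration, not the paper's implementation: dimensions, the GELU activation, and the idea of multiplying each coefficient by a per-vector relevance score are all assumptions chosen for clarity. It shows that an FFN output is a sum of value vectors weighted by their activation coefficients, and that rescaling one coefficient changes the output by exactly that one sub-update.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ffn = 8, 32  # toy dimensions (assumption, not the paper's sizes)

# Standard FFN parameters: key vectors are columns of W_in,
# value vectors are rows of W_out.
W_in = rng.normal(size=(d_model, d_ffn))
W_out = rng.normal(size=(d_ffn, d_model))

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def ffn_as_subupdates(x, relevance=None):
    """FFN output viewed as the sum of sub-updates a_i * v_i.

    Each sub-update is the i-th value vector v_i (row of W_out) scaled by
    its activation coefficient a_i = gelu(x @ k_i). The relevance-weighted
    modulation here is a sketch of the idea, not LLMBRACES's exact
    parameterization: each sub-update is rescaled by a score r_i.
    """
    a = gelu(x @ W_in)        # coefficients a_i, shape (d_ffn,)
    if relevance is not None:
        a = a * relevance     # modulate sub-update contributions
    return a @ W_out          # sum_i a_i * v_i

x = rng.normal(size=(d_model,))
base = ffn_as_subupdates(x)

# Uniform relevance scores of 1 leave the output unchanged.
assert np.allclose(base, ffn_as_subupdates(x, np.ones(d_ffn)))

# Zeroing one score removes exactly that vector's sub-update.
r = np.ones(d_ffn)
r[0] = 0.0
diff = base - ffn_as_subupdates(x, r)
assert np.allclose(diff, gelu(x @ W_in)[0] * W_out[0])
```

In LLMBRACES the relevance scores are computed by a small learned module rather than set by hand, which is what keeps the tunable parameter count far below full fine-tuning.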

Ying Shen, Lifu Huang • 2025

Related benchmarks

Task                       | Dataset                                                                                | Result                  | Rank
---------------------------|----------------------------------------------------------------------------------------|-------------------------|-----
Question Answering         | PopQA                                                                                  | Accuracy: 36.21         | 186
Commonsense Reasoning      | Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) (test)  | BoolQ Accuracy: 74.4    | 138
Question Answering         | TruthfulQA                                                                             | --                      | 82
Question Answering         | Natural Questions (NQ)                                                                 | Accuracy: 20.3          | 36
Trivia QA                  | Trivia QA                                                                              | Accuracy: 66.11         | 32
Sentiment Steering         | OpenWebText Neutral to Positive (test)                                                 | Perplexity (PPL): 30.03 | 27
Sentiment Steering         | OpenWebText Neutral to Negative (test)                                                 | Perplexity (PPL): 39.78 | 27
Question Answering         | AGIEval                                                                                | Accuracy: 32.11         | 12
Toxic Language Suppression | RealToxicityPrompts, 10K nontoxic prompts, GPT2-large generation (test)                | Max Toxicity: 0.172     | 7
