DART-ing Through the Drift: Dynamic Tracing of Knowledge Neurons for Adaptive Inference-Time Pruning

About

Large Language Models (LLMs) exhibit substantial parameter redundancy, particularly in Feed-Forward Networks (FFNs). Existing pruning methods suffer from two primary limitations. First, their reliance on dataset-specific calibration introduces significant data dependency and computational overhead. Second, being predominantly static, they fail to account for how the subset of knowledge neurons in an LLM shifts during autoregressive generation as the context evolves. To address this, we introduce DART (Dynamic Attention-Guided Runtime Tracing), a lightweight, training-free method that performs on-the-fly, context-based pruning. DART monitors shifts in attention score distributions to infer context changes and dynamically updates neuron-level masks to retain salient parameters. Across ten benchmarks, DART outperforms the prior dynamic baseline, achieving accuracy gains of up to 14.5% on LLAMA-3.1-8B at 70% FFN sparsity. Furthermore, DART achieves up to 3x better ROUGE-L scores than static-masked pruning on summarization tasks, with performance comparable to the original dense models. We conclusively demonstrate that the proposed framework adapts effectively to diverse semantic contexts and preserves model capabilities across both general and domain-specific tasks, while using less than 10 MB of memory for LLAMA-3.1-8B (16 GB) with 0.1% FLOPs overhead. The code is available at https://github.com/seeder-research/DART.

Abhishek Tyagi, Yunuo Cen, Shrey Dhorajiya, Bharadwaj Veeravalli, Xuanyao Fong • 2026
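
The abstract describes the mechanism only at a high level: watch the attention score distribution for a shift, and refresh the neuron-level FFN mask when one is detected. The sketch below illustrates that general idea and is not the authors' implementation. Everything specific in it is an assumption made for illustration: the use of Jensen-Shannon divergence as the drift measure, the threshold value, the activation-magnitude saliency score, a fixed-length attention summary, and the names `js_divergence`, `top_k_mask`, and `DriftGatedMask`.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (natural log) between two probability vectors."""
    p = np.asarray(p, dtype=np.float64) + eps
    q = np.asarray(q, dtype=np.float64) + eps
    p /= p.sum()
    q /= q.sum()
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m))
    kl_qm = np.sum(q * np.log(q / m))
    return 0.5 * (kl_pm + kl_qm)


def top_k_mask(scores, sparsity):
    """Boolean mask keeping the highest-scoring (1 - sparsity) fraction of neurons."""
    n = scores.shape[0]
    keep = max(1, int(round(n * (1.0 - sparsity))))
    mask = np.zeros(n, dtype=bool)
    mask[np.argpartition(scores, -keep)[-keep:]] = True
    return mask


class DriftGatedMask:
    """Hypothetical helper: recompute an FFN neuron mask only when attention drifts."""

    def __init__(self, sparsity=0.7, threshold=0.1):
        self.sparsity = sparsity    # fraction of FFN neurons to prune
        self.threshold = threshold  # assumed drift threshold, not from the paper
        self.ref_attn = None        # attention summary at the last mask update
        self.mask = None

    def step(self, attn, ffn_act):
        """attn: fixed-length attention summary; ffn_act: per-neuron activations."""
        drifted = (
            self.ref_attn is None
            or js_divergence(self.ref_attn, attn) > self.threshold
        )
        if drifted:
            # Context changed: store the new reference and rebuild the mask
            # from the current activation magnitudes (assumed saliency score).
            self.ref_attn = np.array(attn, dtype=np.float64)
            self.mask = top_k_mask(np.abs(ffn_act), self.sparsity)
        return self.mask, drifted
```

Calling `step` once per generated token with that token's attention summary and FFN activations returns the current mask and a flag showing whether it was refreshed. Between refreshes the mask is reused, so the per-token cost is one divergence computation over a short vector.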

Related benchmarks

Task                              Dataset       Result           Rank
Commonsense Reasoning             HellaSwag     Accuracy 64.58   1460
Commonsense Reasoning             WinoGrande    Accuracy 65.98   776
Language Understanding            MMLU          Accuracy 34.14   756
Multitask Language Understanding  MMLU (test)   Accuracy 52.33   303
Question Answering                OBQA          Accuracy 36.8    276
Question Answering                GPQA          Accuracy 27.21   258
Medical Question Answering        MedMCQA       Accuracy 29.6    253
Question Answering                ARC-E         Accuracy 59.43   242
Question Answering                BoolQ         Accuracy 66.2    240
Question Answering                ARC-C         Accuracy 38.99   166
Showing 10 of 17 rows
