LINE: LLM-based Iterative Neuron Explanations for Vision Models

About

Interpreting individual neurons in deep neural networks is a crucial step towards understanding their complex decision-making processes and ensuring AI safety. Despite recent progress in neuron labeling, existing methods often limit the search space to predefined concept vocabularies or produce overly specific descriptions that fail to capture higher-order, global concepts. We introduce LINE, a novel, training-free iterative approach tailored for open-vocabulary concept labeling in vision models. Operating in a strictly black-box setting, LINE leverages a large language model and a text-to-image generator to iteratively propose and refine concepts in a closed loop, guided by activation history. LINE achieves state-of-the-art performance across multiple model architectures, yielding AUC improvements of up to 0.11 on ImageNet and 0.05 on Places365, while discovering, on average, 27% of new concepts missed by predefined vocabularies. Beyond identifying the top concept, LINE provides a complete generation history, enabling polysemanticity evaluation and producing visual explanations that rival gradient-dependent activation maximization methods. The source code will be made available soon.
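The closed loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `iterative_concept_search`, `toy_propose`, `toy_generate`, `toy_activation`, and the lookup tables are all hypothetical stand-ins for the LLM, the text-to-image generator, and the black-box neuron readout. Mean activation over the generated images is an assumed scoring rule.

```python
from typing import Callable, List, Tuple

# (concept, mean activation) pairs, kept sorted from best to worst.
History = List[Tuple[str, float]]


def iterative_concept_search(
    propose: Callable[[History], List[str]],
    generate: Callable[[str], List[object]],
    activation: Callable[[object], float],
    n_iters: int = 3,
) -> History:
    """Propose concepts, score them on generated images, and feed the
    scored history back into the next round of proposals."""
    history: History = []
    seen = set()
    for _ in range(n_iters):
        # The proposer only sees the scored history (black-box setting).
        for concept in propose(history):
            if concept in seen:
                continue
            seen.add(concept)
            images = generate(concept)
            if not images:
                continue
            score = sum(activation(img) for img in images) / len(images)
            history.append((concept, score))
        history.sort(key=lambda item: item[1], reverse=True)
    return history


# --- Toy stand-ins (assumptions for illustration only) ---
REFINEMENTS = {"animal": ["dog", "cat"], "dog": ["golden retriever"]}
TOY_SCORES = {
    "animal": 0.4, "vehicle": 0.1, "dog": 0.7,
    "cat": 0.3, "golden retriever": 0.9,
}


def toy_propose(history: History) -> List[str]:
    # Stand-in for the LLM: start broad, then refine the current best concept.
    if not history:
        return ["animal", "vehicle"]
    return REFINEMENTS.get(history[0][0], [])


def toy_generate(concept: str) -> List[object]:
    # Stand-in for the text-to-image model: "images" are just labels.
    return [concept, concept]


def toy_activation(image: object) -> float:
    # Stand-in for reading one neuron's activation on an image.
    return TOY_SCORES[str(image)]


history = iterative_concept_search(toy_propose, toy_generate, toy_activation)
for concept, score in history:
    print(f"{concept}: {score:.2f}")
```

The returned list holds every concept tried, not only the winner, which mirrors the "complete generation history" the abstract mentions.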

Vladimir Zaigrajew, Michał Piechota, Gaspar Sekula, Paweł Gelar, Przemysław Biecek • 2026

Related benchmarks

Task	Dataset	Result
Neuron Interpretation	ImageNet CoSy benchmark avgpool layer 1k	AUC 0.97	12
Concept Discovery	ImageNet	--	5
Neuron Interpretation	Places365 CoSy benchmark avgpool layer	AUC 94	4
Concept Discovery	Places365	--	2
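
For readers unfamiliar with the AUC metric in the table, the sketch below shows one way such a score can be computed for a single neuron and a candidate label. It assumes the score compares the neuron's activations on images depicting the concept against activations on control images; that protocol is an assumption here, not something this page states. The function name `activation_auc` and the sample values are hypothetical.

```python
from typing import Sequence


def activation_auc(concept_acts: Sequence[float], control_acts: Sequence[float]) -> float:
    """Probability that a concept image activates the neuron more strongly
    than a control image, counting ties as one half (Mann-Whitney form of AUC)."""
    if not concept_acts or not control_acts:
        raise ValueError("both activation lists must be non-empty")
    wins = 0.0
    for c in concept_acts:
        for k in control_acts:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(concept_acts) * len(control_acts))


# Made-up activations for illustration only.
concept = [0.9, 0.8, 0.4]
control = [0.5, 0.3, 0.1]
print(activation_auc(concept, control))
```

A value near 1 means the label separates concept images from control images almost perfectly, while 0.5 corresponds to chance.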
