Using Captum to Explain Generative Language Models

About

Captum is a comprehensive library for model explainability in PyTorch, offering a range of methods from the interpretability literature to enhance users' understanding of PyTorch models. In this paper, we introduce new features in Captum that are specifically designed to analyze the behavior of generative language models. We provide an overview of the available functionality and example applications that illustrate its potential for understanding learned associations within generative language models.

Vivek Miglani, Aobo Yang, Aram H. Markosyan, Diego Garcia-Olano, Narine Kokhlikyan • 2023
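
To make the idea of analyzing a generative model's behavior concrete, here is a minimal sketch of one common family of interpretability methods, perturbation-based attribution by feature ablation. It is not Captum's API and not taken from the paper: `ablation_attribution`, `toy_score`, the `<pad>` baseline, and the token weights are all hypothetical, and `toy_score` stands in for what would normally be a language model's log-probability of a target continuation.

```python
from typing import Callable, List, Sequence


def ablation_attribution(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    baseline_token: str = "<pad>",
) -> List[float]:
    """Score each token by how much the output drops when it is replaced.

    Hypothetical helper for illustration only; it does not call Captum.
    """
    full_score = score_fn(tokens)
    attributions = []
    for i in range(len(tokens)):
        # Replace exactly one token with the baseline and re-score.
        ablated = list(tokens)
        ablated[i] = baseline_token
        attributions.append(full_score - score_fn(ablated))
    return attributions


def toy_score(tokens: Sequence[str]) -> float:
    """Toy stand-in for a model's score of a fixed target continuation."""
    # Assumed weights: only two tokens influence the toy score.
    weights = {"capital": 1.0, "Paris": 2.0}
    return sum(weights.get(t, 0.0) for t in tokens)


prompt = ["The", "capital", "of", "France", "is", "Paris"]
attr = ablation_attribution(prompt, toy_score)

for token, value in zip(prompt, attr):
    print(f"{token:>8}: {value:+.1f}")
```

With a real model, `score_fn` would run a forward pass per ablated prompt, so the cost grows linearly with the number of tokens (or token groups) being ablated.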

Related benchmarks

Task	Dataset	Metric	Result	
Faithfulness Measurement	MHC	BLEU	68.8	18
Faithfulness Measurement	tldr_news	BLEU	75.9	12
Faithfulness Measurement	Alpaca	BLEU	0.515	12
Explanation Generation	Alpaca avg prompt instance	Inference Time (s)	1.17e+3	2
Explanation Generation	tldr_news avg prompt instance	Latency (s)	1.73e+3	2
Explanation Generation	MHC avg prompt instance	Time (s)	1.81e+3	2
