
TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations

About

Machine Learning (ML) models are increasingly used to make critical decisions in real-world applications, yet they have become more complex, making them harder to understand. To this end, researchers have proposed several techniques to explain model predictions. However, practitioners struggle to use these explainability techniques because they often do not know which one to choose or how to interpret the results of the explanations. In this work, we address these challenges by introducing TalkToModel: an interactive dialogue system for explaining machine learning models through conversations. Specifically, TalkToModel comprises three key components: 1) a natural language interface for engaging in conversations, making ML model explainability highly accessible, 2) a dialogue engine that adapts to any tabular model and dataset, interprets natural language, maps it to appropriate explanations, and generates text responses, and 3) an execution component that constructs the explanations. We carried out extensive quantitative and human subject evaluations of TalkToModel. Overall, we found that the conversational system understands user inputs on novel datasets and models with high accuracy, demonstrating the system's capacity to generalize to new situations. In real-world evaluations with humans, 73% of healthcare workers (e.g., doctors and nurses) agreed they would use TalkToModel over baseline point-and-click systems for explainability in a disease prediction task, and 85% of ML professionals agreed TalkToModel was easier to use for computing explanations. Our findings demonstrate that TalkToModel is more effective for model explainability than existing systems, introducing a new category of explainability tools for practitioners. Code and demo are available at https://github.com/dylan-slack/TalkToModel.
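
As a rough illustration of the three-component flow described above (interface, dialogue engine, execution component), the following toy sketch routes a user utterance to an explanation operation and runs it on a tiny tabular dataset. It is not the TalkToModel implementation: the keyword-matching `parse` function, the `toy_model` classifier, the perturbation-based importance score, and all names and data are hypothetical stand-ins chosen only to keep the example self-contained.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def toy_model(row: Row) -> int:
    """Hypothetical stand-in classifier: 1 if glucose is above 120, else 0."""
    return 1 if row["glucose"] > 120 else 0


def run_predict(rows: List[Row]) -> str:
    """Execution step: report the model's prediction for each row."""
    return "Predictions: " + ", ".join(str(toy_model(r)) for r in rows)


def run_importance(rows: List[Row]) -> str:
    """Execution step: crude perturbation score (assumed, not the paper's method).

    Each feature's score is the share of rows whose prediction flips
    when that feature is set to zero.
    """
    scores = {}
    for feature in rows[0]:
        flips = sum(toy_model(r) != toy_model({**r, feature: 0.0}) for r in rows)
        scores[feature] = flips / len(rows)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return "Feature importance: " + ", ".join(f"{k}={v:.2f}" for k, v in ranked)


# Dialogue engine, part 1: the operations an utterance can map to.
OPERATIONS: Dict[str, Callable[[List[Row]], str]] = {
    "predict": run_predict,
    "importance": run_importance,
}


def parse(utterance: str) -> str:
    """Dialogue engine, part 2: keyword matching as a placeholder parser."""
    text = utterance.lower()
    if "important" in text or "why" in text:
        return "importance"
    if "predict" in text:
        return "predict"
    return "unknown"


def respond(utterance: str, rows: List[Row]) -> str:
    """Interface: take a user utterance and return a text response."""
    operation = OPERATIONS.get(parse(utterance))
    if operation is None:
        return "Sorry, I did not understand that."
    return operation(rows)


# Hypothetical two-row dataset for demonstration only.
ROWS: List[Row] = [
    {"glucose": 150.0, "age": 50.0},
    {"glucose": 100.0, "age": 30.0},
]

if __name__ == "__main__":
    print(respond("What do you predict for these patients?", ROWS))
    print(respond("Which features are most important?", ROWS))
```

Keeping parsing separate from execution means the placeholder parser could be swapped for a stronger one without touching the explanation code, which mirrors the separation of components the abstract describes.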

Dylan Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh • 2022

Related benchmarks

Task                    Dataset   Metric        Result  Rank
Conversational XAI      COMPAS    Faithfulness  68      12
Conversational XAI      German    Faithfulness  62      12
Conversational XAI      Diabetes  Faithfulness  67      12
Explanation Generation  COMPAS    PPL           4.5     7
Explanation Generation  Diabetes  PPL           3.71    7
Explanation Generation  fifa      PPL           5.05    7
Explanation Generation  Credit    PPL           4.3     7
Explanation Generation  stroke    PPL           4.14    7
Explanation Generation  student   PPL           4.56    7
