Decomposed Prompting: A Modular Approach for Solving Complex Tasks

About

Few-shot prompting is a surprisingly powerful way to use Large Language Models (LLMs) to solve various tasks. However, this approach struggles as task complexity increases or when the individual reasoning steps of the task are themselves hard to learn, especially when embedded in more complex tasks. To address this, we propose Decomposed Prompting, a new approach that solves complex tasks by decomposing them (via prompting) into simpler sub-tasks that can be delegated to a library of prompting-based LLMs dedicated to these sub-tasks. This modular structure allows each prompt to be optimized for its specific sub-task, further decomposed if necessary, and even easily replaced with more effective prompts, trained models, or symbolic functions if desired. We show that the flexibility and modularity of Decomposed Prompting allow it to outperform prior work on few-shot prompting using GPT3. On symbolic reasoning tasks, we can further decompose sub-tasks that are hard for LLMs into even simpler solvable sub-tasks. When the complexity comes from the input length, we can recursively decompose the task into the same task but with smaller inputs. We also evaluate our approach on textual multi-step reasoning tasks: on a long-context multi-hop QA task, we can more effectively teach the sub-tasks via separate sub-task prompts; and on open-domain multi-hop QA, we can incorporate symbolic information retrieval within our decomposition framework, leading to improved performance on both tasks. Datasets, code, and prompts are available at https://github.com/allenai/DecomP.
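The control flow described in the abstract (a decomposer that delegates to a library of sub-task handlers, with recursion on smaller inputs when the input is long) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy task (concatenating the first letters of words), the handler names, and the `solve` function are hypothetical and are not taken from the DecomP repository. Plain Python functions stand in for the few-shot prompted LLMs, which the abstract notes may be swapped for symbolic functions.

```python
from typing import Callable, Dict

# A sub-task handler maps a string input to a string output. In Decomposed
# Prompting each handler would be a few-shot prompted LLM; the stand-ins below
# are ordinary functions (hypothetical, for illustration only).
Handler = Callable[[str], str]


def split_words(text: str) -> str:
    """Sub-task: split text into words, returned as a '|'-joined list."""
    return "|".join(text.split())


def first_letter(word: str) -> str:
    """Sub-task: return the first letter of a single non-empty word."""
    return word[0]


def merge(items: str) -> str:
    """Sub-task: concatenate '|'-separated items into one string."""
    return "".join(items.split("|"))


LIBRARY: Dict[str, Handler] = {
    "split": split_words,
    "first_letter": first_letter,
    "merge": merge,
}


def concat_first_letters(words: str, library: Dict[str, Handler], max_len: int = 2) -> str:
    """Recursive decomposition: long inputs become the same task on halves."""
    parts = words.split("|")
    if len(parts) <= max_len:
        # Base case: the input is short enough to handle word by word.
        letters = [library["first_letter"](w) for w in parts]
        return library["merge"]("|".join(letters))
    mid = len(parts) // 2
    left = concat_first_letters("|".join(parts[:mid]), library, max_len)
    right = concat_first_letters("|".join(parts[mid:]), library, max_len)
    return library["merge"](left + "|" + right)


def solve(question: str, library: Dict[str, Handler]) -> str:
    """Decomposer: route the task through the sub-task handlers in order."""
    words = library["split"](question)
    return concat_first_letters(words, library)


print(solve("Decomposed Prompting A Modular Approach", LIBRARY))  # DPAMA

# Modularity: replace one handler without touching the rest of the pipeline.
upper_library = {**LIBRARY, "first_letter": lambda w: w[0].upper()}
print(solve("a b c", upper_library))  # ABC
```

Keeping handlers behind a shared string-to-string interface is one way to obtain the replaceability the abstract describes: a handler can be re-prompted, decomposed further, or replaced by a trained model or symbolic function without changing the decomposer.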

Tushar Khot, Harsh Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark, Ashish Sabharwal • 2022

Related benchmarks

Task	Dataset	Result
Natural Language Inference	RTE	Accuracy 85.2	590
Multi-hop Question Answering	2WikiMultiHopQA (test)	--	226
Question Answering	HotpotQA (test)	--	37
Open-domain Question Answering	2WikiMultihopQA	EM 59.3	16
Multi-hop Reasoning	CommaQA-E (test)	Exact Match 64	15
Multi-hop Reasoning	CommaQA-E compositional	Exact Match 73.5	15
Opinion Summarization	AmaSum product reviews (sports shoes) (test)	Coverage 58	11
Opinion Summarization	SPACE hotels (test)	Coverage 0.94	11
Meta-review generation	PeerSum (test)	Coverage 84	11
Opinion Summarization	SPACE	Coverage 93	9

Showing 10 of 23 rows
