
Decomposed Prompting: A Modular Approach for Solving Complex Tasks

About

Few-shot prompting is a surprisingly powerful way to use Large Language Models (LLMs) to solve various tasks. However, this approach struggles as task complexity increases or when the individual reasoning steps of the task are themselves hard to learn, especially when embedded in more complex tasks. To address this, we propose Decomposed Prompting, a new approach that solves complex tasks by decomposing them (via prompting) into simpler sub-tasks that can be delegated to a library of prompting-based LLMs dedicated to these sub-tasks. This modular structure allows each prompt to be optimized for its specific sub-task, further decomposed if necessary, and even easily replaced with more effective prompts, trained models, or symbolic functions if desired. We show that the flexibility and modularity of Decomposed Prompting allow it to outperform prior work on few-shot prompting using GPT-3. On symbolic reasoning tasks, we can further decompose sub-tasks that are hard for LLMs into even simpler solvable sub-tasks. When the complexity comes from the input length, we can recursively decompose the task into the same task but with smaller inputs. We also evaluate our approach on textual multi-step reasoning tasks: on a long-context multi-hop QA task, we can more effectively teach the sub-tasks via our separate sub-task prompts; and on open-domain multi-hop QA, we can incorporate symbolic information retrieval within our decomposition framework, leading to improved performance on both tasks. Datasets, code, and prompts are available at https://github.com/allenai/DecomP.
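
The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `decomposer`, `HANDLERS`, `solve`, and `reverse_list` are hypothetical, and plain Python functions stand in for what would normally be few-shot prompted LLM calls. The first part routes a task through a sequence of sub-task handlers; the second shows recursive decomposition of a task into the same task on smaller inputs.

```python
from typing import Callable, Dict, List

# Hypothetical sub-task handlers. In Decomposed Prompting each of these would
# typically be a few-shot prompted LLM; symbolic functions stand in here so the
# sketch runs without any model access.
def split_words(text: str) -> List[str]:
    return text.split()

def first_letters(words: List[str]) -> List[str]:
    return [word[0] for word in words]

def merge(letters: List[str]) -> str:
    # Assumption: letters are joined with a single space.
    return " ".join(letters)

# The "library" of sub-task handlers. Any entry can be swapped for a better
# prompt, a trained model, or another symbolic function.
HANDLERS: Dict[str, Callable] = {
    "split": split_words,
    "first_letter": first_letters,
    "merge": merge,
}

def decomposer(task: str) -> List[str]:
    """Stand-in for the decomposer prompt: name the sub-tasks to run, in order."""
    if task == "letter_concat":
        return ["split", "first_letter", "merge"]
    raise ValueError(f"no decomposition known for task: {task}")

def solve(task: str, task_input, handlers: Dict[str, Callable] = HANDLERS):
    """Run each sub-task in sequence, feeding every output to the next handler."""
    value = task_input
    for name in decomposer(task):
        value = handlers[name](value)
    return value

def reverse_list(items: list, max_len: int = 2) -> list:
    """Recursive decomposition: the same task applied to smaller inputs."""
    if len(items) <= max_len:
        return items[::-1]  # short enough to solve directly
    mid = len(items) // 2
    # Reverse each half, then put the reversed second half first.
    return reverse_list(items[mid:], max_len) + reverse_list(items[:mid], max_len)
```

Because handlers are looked up by name, replacing one (for example, passing a copy of `HANDLERS` with a different `"merge"` function to `solve`) changes that sub-task without touching the rest of the pipeline, which is the modularity the abstract emphasizes.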

Tushar Khot, Harsh Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark, Ashish Sabharwal • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Natural Language Inference | RTE | Accuracy 85.2 | 367 |
| Multi-hop Question Answering | 2WikiMultiHopQA (test) | -- | 143 |
| Question Answering | HotpotQA (test) | Ans F1 53.5 | 37 |
| Multi-hop Reasoning | CommaQA-E (test) | Exact Match 64 | 15 |
| Multi-hop Reasoning | CommaQA-E compositional | Exact Match 73.5 | 15 |
| Opinion Summarization | AmaSum product reviews (sports shoes) (test) | Coverage 58 | 11 |
| Opinion Summarization | SPACE hotels (test) | Coverage 0.94 | 11 |
| Meta-review generation | PeerSum (test) | Coverage 84 | 11 |
| Meta-review summarization | PeerSum Research Articles | Coverage 90 | 6 |
| Meta-review summarization | AmaSum Sports Shoes | Coverage 100 | 6 |
Showing 10 of 19 rows
