StructGPT: A General Framework for Large Language Model to Reason over Structured Data

About

In this paper, we study how to improve the zero-shot reasoning ability of large language models (LLMs) over structured data in a unified way. Inspired by studies on tool augmentation for LLMs, we develop an Iterative Reading-then-Reasoning (IRR) approach for solving question answering tasks based on structured data, called StructGPT. In our approach, we construct specialized functions to collect relevant evidence from structured data (i.e., reading), and let LLMs concentrate on the reasoning task based on the collected information (i.e., reasoning). Specifically, we propose an invoking-linearization-generation procedure to support LLMs in reasoning on the structured data with the help of external interfaces. By iterating this procedure with the provided interfaces, our approach can gradually approach the target answer to a given query. Extensive experiments conducted on three types of structured data demonstrate the effectiveness of our approach, which can significantly boost the performance of ChatGPT and achieve comparable performance against the full-data supervised-tuning baselines. Our code and data are publicly available at https://github.com/RUCAIBox/StructGPT.

Jinhao Jiang, Kun Zhou, Zican Dong, Keming Ye, Wayne Xin Zhao, Ji-Rong Wen • 2023
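
The invoking-linearization-generation loop described in the abstract can be illustrated with a minimal sketch over a toy knowledge graph. This is not the authors' implementation: the graph data, the function names (neighbor_relations, triples_for, toy_select_relation, answer), and the keyword-matching stand-in for the LLM are all assumptions made for illustration. In a real system the generation step would be a call to an LLM that reads the linearized evidence.

```python
import re
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy knowledge graph; the entities and relations are made up for illustration.
KG: List[Triple] = [
    ("Alice", "works_at", "Acme"),
    ("Alice", "born_in", "Rome"),
    ("Acme", "located_in", "Paris"),
]

# --- Invoking: hypothetical interfaces that read from the structured data ---
def neighbor_relations(kg: List[Triple], entity: str) -> List[str]:
    """Return the outgoing relations of an entity."""
    return sorted({r for h, r, _ in kg if h == entity})

def triples_for(kg: List[Triple], entity: str, relation: str) -> List[Triple]:
    """Return the triples that start at `entity` and use `relation`."""
    return [(h, r, t) for h, r, t in kg if h == entity and r == relation]

# --- Linearization: turn structured evidence into plain text ---
def linearize_relations(relations: List[str]) -> str:
    return ", ".join(relations)

def linearize_triples(triples: List[Triple]) -> str:
    return "; ".join(f"({h}, {r}, {t})" for h, r, t in triples)

# --- Generation: a keyword matcher standing in for the LLM (assumption) ---
def toy_select_relation(question: str, relations_text: str) -> Optional[str]:
    """Pick the first candidate relation whose words all occur in the question."""
    words = set(re.findall(r"\w+", question.lower()))
    for rel in relations_text.split(", "):
        if all(part in words for part in rel.split("_")):
            return rel
    return None

def answer(question: str, topic_entity: str, kg: List[Triple],
           max_hops: int = 3) -> Tuple[str, List[str]]:
    """Iterate invoke -> linearize -> generate until no further hop applies."""
    current, trace = topic_entity, []
    for _ in range(max_hops):
        relations = neighbor_relations(kg, current)            # invoke
        if not relations:
            break
        chosen = toy_select_relation(question,                 # linearize + generate
                                     linearize_relations(relations))
        if chosen is None:
            break
        triples = triples_for(kg, current, chosen)             # invoke
        trace.append(linearize_triples(triples))               # linearize
        # A real LLM would read the linearized triples and choose the next
        # entity; this sketch simply follows the first matching triple.
        current = triples[0][2]
    return current, trace

result, evidence = answer("Where is the company that Alice works at located in?",
                          "Alice", KG)
print(result)
print(evidence)
```

Each pass narrows the evidence passed to the generation step: first only the candidate relations of the current entity, then only the triples for the chosen relation. Bounding the loop with max_hops is a design choice of this sketch to guarantee termination.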

Related benchmarks

Task                               Dataset         Metric          Result  Rank
Multi-task Language Understanding  MMLU            Accuracy        61.4    842
Language Understanding             MMLU            Accuracy        83.41   756
Question Answering                 ARC Challenge   Accuracy        83.28   749
Reasoning                          BBH             Accuracy        71.93   507
Question Answering                 ARC Easy        Normalized Acc  86.87   385
Reading Comprehension              RACE high       Accuracy        79.87   295
Reading Comprehension              RACE mid        Accuracy        81.27   196
Knowledge Base Question Answering  WEBQSP (test)   Hit@1           72.6    143
Question Answering                 CommonsenseQA   Accuracy        71.58   143
Commonsense Reasoning              CommonsenseQA   Accuracy        59.3    132
