
Large Language Models are few(1)-shot Table Reasoners

About

Recent literature has shown that large language models (LLMs) are generally excellent few-shot reasoners on text reasoning tasks. However, the capability of LLMs on table reasoning tasks remains underexplored. In this paper, we aim to understand how well LLMs can perform table-related tasks with few-shot in-context learning. Specifically, we evaluate LLMs on popular table QA and fact verification datasets such as WikiTableQuestions, FetaQA, TabFact, and FEVEROUS, and find that LLMs are competent at complex reasoning over table structures even though they are not pre-trained on any table corpus. When combined with chain-of-thought prompting, LLMs can achieve very strong performance with only a 1-shot demonstration, on par with some SoTA models. We show that LLMs are even more competent at generating comprehensive long-form answers on FetaQA than tuned T5-large. We further manually studied the reasoning chains elicited from LLMs and found that these chains are highly consistent with the underlying semantic forms. We believe that LLMs can serve as a simple yet generic baseline for future research. The code and data are released at https://github.com/wenhuchen/TableCoT.
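The 1-shot chain-of-thought setup described above can be sketched as prompt construction: linearize the table into text, prepend a single worked demonstration that includes its reasoning chain, then let the model continue from the test instance. The sketch below is illustrative only; the helper names, the demonstration table, and the reasoning text are assumptions, not examples taken from the paper or its repository.

```python
# Minimal sketch of 1-shot chain-of-thought prompting for table QA.
# All tables, questions, and reasoning strings here are hypothetical
# placeholders, not data from the paper.

def linearize_table(header, rows):
    """Flatten a table into a pipe-delimited string an LLM can read."""
    lines = [" | ".join(header)]
    lines += [" | ".join(str(c) for c in row) for row in rows]
    return "\n".join(lines)

def build_one_shot_cot_prompt(demo, test_table, test_question):
    """Prepend one worked demonstration (with its reasoning chain)
    before the test instance, leaving the answer for the model."""
    demo_block = (
        f"Table:\n{linearize_table(*demo['table'])}\n"
        f"Question: {demo['question']}\n"
        f"Answer: Let's think step by step. {demo['reasoning']} "
        f"So the answer is {demo['answer']}.\n\n"
    )
    test_block = (
        f"Table:\n{linearize_table(*test_table)}\n"
        f"Question: {test_question}\n"
        f"Answer: Let's think step by step."
    )
    return demo_block + test_block

# Hypothetical demonstration instance.
demo = {
    "table": (["City", "Population"],
              [["Springfield", 30000], ["Shelbyville", 25000]]),
    "question": "Which city has the larger population?",
    "reasoning": "Springfield has 30000 people and Shelbyville has 25000; "
                 "30000 > 25000.",
    "answer": "Springfield",
}

prompt = build_one_shot_cot_prompt(
    demo,
    (["Team", "Wins"], [["Eagles", 11], ["Hawks", 9]]),
    "Which team has more wins?",
)
print(prompt)
```

The resulting string would be sent to an LLM as-is; the trailing "Let's think step by step." cue prompts the model to emit a reasoning chain before its final answer.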

Wenhu Chen • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
Table Question Answering | WikiTQ (test) | Accuracy | 52.4 | 92
Table Question Answering | WikiTableQuestions (test) | -- | -- | 86
Fact Verification | TabFact | Accuracy | 92.1 | 73
Table Fact Verification | TabFact small (test) | Accuracy | 0.7861 | 57
Table-based Fact Verification | TabFact | Accuracy | 73.1 | 33
Table Fact Verification | TabFact small | Overall Accuracy | 78.61 | 18
Table Fact Verification | TabFact full (test) | Simple Accuracy | 84.36 | 16
Table Fact Verification | TabFact full | Simple Accuracy | 84.36 | 16
Table Question Answering | WikiTQ Large (>4000 tokens) | Accuracy | 35.1 | 8
Free-form Table Question Answering | FeTaQA (100 randomly chosen samples) | Fluency | 0.96 | 6

(Showing 10 of 11 rows)
