FORTAP: Using Formulas for Numerical-Reasoning-Aware Table Pretraining

About

Tables store rich numerical data, but numerical reasoning over tables is still a challenge. In this paper, we find that the spreadsheet formula, which performs calculations on numerical values in tables, naturally provides strong supervision for numerical reasoning. More importantly, large numbers of spreadsheets with expert-made formulae are available on the web and can be obtained easily. FORTAP is the first method for numerical-reasoning-aware table pretraining that leverages a large corpus of spreadsheet formulae. We design two formula pretraining tasks to explicitly guide FORTAP to learn numerical reference and calculation in semi-structured tables. FORTAP achieves state-of-the-art results on two representative downstream tasks, cell type classification and formula prediction, showing the great potential of numerical-reasoning-aware pretraining.
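To make the idea of formulas as supervision concrete, here is a minimal sketch of how a single spreadsheet formula could be turned into two kinds of labels: which cells it references and which operator it applies. This is an illustration only and not FORTAP's actual pretraining pipeline; the function name `formula_labels`, the output dictionary layout, and the regex-based parsing are all assumptions made for this example, and the parser handles only simple A1-style references and ranges.

```python
import re

# Toy patterns for A1-style references (hypothetical helper, not from the paper).
# Caveat: a function name containing digits (e.g. LOG10) would be mistaken for a cell.
CELL_RE = re.compile(r"([A-Z]+)([0-9]+)")
RANGE_RE = re.compile(r"([A-Z]+[0-9]+):([A-Z]+[0-9]+)")


def col_to_index(col):
    """Convert a column label such as 'A' or 'AA' to a 1-based index."""
    idx = 0
    for ch in col:
        idx = idx * 26 + (ord(ch) - ord("A") + 1)
    return idx


def index_to_col(idx):
    """Convert a 1-based column index back to its label."""
    letters = ""
    while idx > 0:
        idx, rem = divmod(idx - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def expand_range(start, end):
    """List every cell in a rectangular range, column by column."""
    c1, r1 = CELL_RE.fullmatch(start).groups()
    c2, r2 = CELL_RE.fullmatch(end).groups()
    cols = range(col_to_index(c1), col_to_index(c2) + 1)
    rows = range(int(r1), int(r2) + 1)
    return [f"{index_to_col(c)}{r}" for c in cols for r in rows]


def formula_labels(formula):
    """Extract the leading function name (if any) and the referenced cells."""
    body = formula.lstrip("=")
    func = re.match(r"([A-Z]+)\(", body)
    operator = func.group(1) if func else None

    refs = []

    def collect_range(match):
        # Record every cell in the range, then blank it out of the formula text.
        refs.extend(expand_range(match.group(1), match.group(2)))
        return ""

    rest = RANGE_RE.sub(collect_range, body)
    # Whatever cell references remain are single-cell references.
    refs.extend(m.group(0) for m in CELL_RE.finditer(rest))
    return {"operator": operator, "referenced_cells": refs}


if __name__ == "__main__":
    print(formula_labels("=SUM(B2:B4)"))
    print(formula_labels("=B2/C2"))
```

In this sketch, the referenced cells stand in for a "numerical reference" signal and the operator for a "calculation" signal; a real system would need a proper formula parser and a mapping from cell addresses to table positions.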

Zhoujun Cheng, Haoyu Dong, Ran Jia, Pengfei Wu, Shi Han, Fan Cheng, Dongmei Zhang • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| Hierarchical Table Question Answering | HiTab | Accuracy: 47 | 11 |
| Formula Prediction | Enron (test) | Formula Score: 55.8 | 6 |
| Cell Type Classification | DeEx (test) | Macro Accuracy: 85.2 | 6 |
| Table Question Answering | HiTab (test) | Execution Accuracy: 47 | 5 |
| Table Question Answering | HiTab (dev) | Execution Accuracy: 47.1 | 5 |
