
TabXEval: Why this is a Bad Table? An eXhaustive Rubric for Table Evaluation

About

Evaluating tables qualitatively and quantitatively poses a significant challenge, as standard metrics often overlook subtle structural and content-level discrepancies. To address this, we propose a rubric-based evaluation framework that integrates multi-level structural descriptors with fine-grained contextual signals, enabling more precise and consistent table comparison. Building on this, we introduce TabXEval, an eXhaustive and eXplainable two-phase evaluation framework. TabXEval first aligns reference and predicted tables structurally via TabAlign, then performs semantic and syntactic comparison using TabCompare, offering interpretable and granular feedback. We evaluate TabXEval on TabXBench, a diverse, multi-domain benchmark featuring realistic table perturbations and human annotations. A sensitivity-specificity analysis further demonstrates the robustness and explainability of TabXEval across varied table tasks. Code and data are available at https://coral-lab-asu.github.io/tabxeval/
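The abstract describes a two-phase pipeline: structural alignment (TabAlign) followed by cell-level comparison (TabCompare). As a rough illustration of that align-then-compare idea, here is a minimal Python sketch; the function names, the greedy header-matching heuristic, and the similarity thresholds are all assumptions for illustration, not the authors' implementation.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Normalized string similarity between two cell values."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()

def tab_align(ref_header, pred_header, threshold=0.5):
    """Phase 1 (TabAlign-style, hypothetical): greedily match each
    reference column to its most similar unused predicted column."""
    mapping, used = {}, set()
    for i, rh in enumerate(ref_header):
        best, best_score = None, threshold
        for j, ph in enumerate(pred_header):
            if j in used:
                continue
            s = similarity(rh, ph)
            if s > best_score:
                best, best_score = j, s
        if best is not None:
            mapping[i] = best
            used.add(best)
    return mapping  # {ref column index: pred column index}

def tab_compare(ref_rows, pred_rows, col_map):
    """Phase 2 (TabCompare-style, hypothetical): compare aligned cells
    row by row and return the fraction that match closely."""
    matches, total = 0, 0
    for ref_row, pred_row in zip(ref_rows, pred_rows):
        for i, j in col_map.items():
            total += 1
            if similarity(ref_row[i], pred_row[j]) > 0.9:
                matches += 1
    return matches / total if total else 0.0
```

For example, `tab_align(["Model", "Accuracy"], ["model", "acc"])` maps both columns despite the header mismatch, after which `tab_compare` scores only the aligned cells. The actual framework produces interpretable, per-discrepancy feedback rather than a single score; see the linked code for the real method.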

Vihang Pancholi, Jainit Bafna, Tejas Anvekar, Manish Shrivastava, Vivek Gupta • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Metric Correlation Analysis | Synthetic perturbation sets (test) | Spearman's rho | 80.27 | 17 |
| Correlation with Human Judgments | TabXBench Endurance 1.0 (test) | Spearman's rho | 0.44 | 13 |
| Evaluation Metric Correlation Analysis | Real-world text-to-table generation | Spearman's rho | 0.24 | 9 |
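All three benchmark results report Spearman's rank correlation. For reference, this is a minimal pure-Python computation of Spearman's rho via the classic formula rho = 1 - 6 Σ d_i² / (n(n² - 1)), which holds when there are no tied ranks; this is a textbook sketch, not code from the TabXEval repository.

```python
def rank(values):
    """Rank each value (1 = smallest); assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation between two equal-length sequences,
    using rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)) for untied data."""
    assert len(x) == len(y) and len(x) > 1, "need two sequences of equal length > 1"
    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```

A rho near 1 means the metric ranks system outputs in nearly the same order as the reference signal (human judgments or perturbation severity); 0 means no monotonic relationship. With ties, the Pearson correlation of the tie-corrected ranks should be used instead.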
