FinQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Financial Question Answering | FinQA (test) | Accuracy 76.05 | 57 |
| Financial Reasoning | FinQA | Accuracy 77.6 | 33 |
| Numerical Question Answering | FinQA (test) | Execution Accuracy 91.16 | 33 |
| Financial Question Answering | FinQA | Accuracy 80.47 | 16 |
| Financial Open-ended QA | FinQA (test) | Token Accuracy 29.67 | 16 |
| Financial Open-ended Question Answering | FinQA (test) | Token Perplexity 3.9697 | 16 |
| Numerical Question Answering | FinQA 1.0 (test) | Execution Accuracy 91.16 | 14 |
| Proactive information probing | FinQA | PC 4.8 | 12 |
| Question Answering | FinQA (val) | Execution Accuracy 0.6122 | 10 |
| Question Answering | FinQA | Prog Acc 59.37 | 9 |
| RAG Poisoning Attack (Document-Level Targeting) | FinQA | RSR@5 47.1 | 7 |
| Fact-Level RAG Poisoning Attack | FinQA | RSR@5 99.8 | 7 |
| Numerical Reasoning Question Answering | FinQA v1 (dev) | Execution Accuracy 72.91 | 7 |
| Fact Retrieval | FinQA (test) | Recall@3 93.31 | 7 |
| Fact Retrieval | FinQA (dev) | R@3 95.03 | 7 |
| Multi-step Reasoning over Code Dependencies | FinQA hard | Accuracy 65.56 | 6 |
| Retrieval | FinQA (test) | Exact Match (EM) 62.6 | 5 |
| Retrieval | FinQA (evaluation) | Exact Match (EM) 0.623 | 5 |
| Table-QA | FinQA | Quality Score 26.1 | 4 |
| Table Question Answering | FinQA (dev) | Accuracy 59 | 4 |
| Question Answering | FinQA | EM 66.2 | 2 |