Table Question Answering on TAT-QA (Execution Match)
Chart: Execution Match (EM) by date. Plotted point: GPT-4o, 77.78, Mar 24, 2025.
Updated 21d ago
Evaluation Results
| Method | Model Type | Date | Execution Match (EM) |
|---|---|---|---|
| GPT-4o | Base Model | 2025.03 | 77.78 |
| Qwen2.5 14B | Base Model | 2025.03 | 15.47 |
| Qwen2.5 14B | FT_Syn-QA | 2025.03 | 15.03 |
| Qwen2.5-coder 14B | FT_Syn-QA | 2025.03 | 14.6 |
| Qwen2.5-coder 14B | Base Model | 2025.03 | 14.37 |
| Qwen2.5 3B | FT_Syn-QA | 2025.03 | 11.11 |
| Qwen2.5-coder 3B | FT_Syn-QA | 2025.03 | 8.06 |
| CodeLlama-Instruct 13B | FT_Syn-QA | 2025.03 | 7.41 |
| Qwen2.5 3B | Base Model | 2025.03 | 6.75 |
| Llama2 13B | FT_Syn-QA | 2025.03 | 6.54 |
| Qwen2.5-coder 3B | Base Model | 2025.03 | 5.88 |
| Llama2 7B | FT_Syn-QA | 2025.03 | 3.92 |
| CodeLlama-Instruct 13B | Base Model | 2025.03 | 3.49 |
| CodeLlama-Instruct 7B | FT_Syn-QA | 2025.03 | 2.18 |
| Llama2 13B | Base Model | 2025.03 | 1.09 |
| CodeLlama-Instruct 7B | Base Model | 2025.03 | 0.44 |
| Llama2 7B | Base Model | 2025.03 | 0 |