
Question Answering on TAT-QA (held-out)

Accuracy: 59.33

Claude-3.5-Sonnet

[Chart: Accuracy over time; x-axis label: Sep 21, 2025]
Updated 22d ago

Evaluation Results

Date      Accuracy
2025.09   59.33
2025.09   56.29
2025.09   56.16
2025.09   55.48
2025.09   53.85
2025.09   52.23
2025.09   51.68
2025.09   48.06
2025.09   46.5
2025.09   37.82
2025.09   30.44
2025.09   29.02
2025.09   26.42
2025.09   24.11
2025.09   20.85
2025.09   19.69
2025.09   15.67
2025.09   12.82
2025.09   12.31
2025.09   2.97
2025.09   0.39