Long-form Question Answering on FetaQA (test)
[Chart: BLEU by date (point shown: Jan 31, 2025). Best result: Chain-of-Table, 32.61 BLEU. Selectable metrics: BLEU, ROUGE-1, ROUGE-2, ROUGE-L.]
Updated 2d ago
Evaluation Results
| Method         | Backbone | Date    | BLEU  | ROUGE-1 | ROUGE-2 | ROUGE-L |
|----------------|----------|---------|-------|---------|---------|---------|
| Chain-of-Table | PaLM 2   | 2025.01 | 32.61 | 66      | 44      | 56      |
| Fine-Tuning    | T5-large | 2025.01 | 30.54 | 63      | 41      | 53      |
| Dater          | PaLM 2   | 2025.01 | 29.47 | 63      | 41      | 53      |
| TableMaster    | gpt-4o   | 2025.01 | 28.94 | 66.06   | 45.29   | 54.56   |
| End-to-End QA  | PaLM 2   | 2025.01 | 28.37 | 63      | 41      | 53      |
| End-to-End QA  | Codex    | 2025.01 | 27.96 | 62      | 40      | 52      |
| End-to-End QA  | gpt-4o   | 2025.01 | 24.91 | 62.05   | 41.29   | 50.36   |