Fact-checking Explanation Generation on Combined Datasets (Overall)
[Chart: Helpfulness Score by date. CLUE: 73 (May 23, 2025). Selectable metrics: Helpfulness Score, Consistency Score, Non-redundancy Score, Coverage Score, Overall Quality Score.]
Updated 1mo ago
Evaluation Results
| Method | Method (configuration) | Links | Helpfulness Score | Consistency Score | Non-redundancy Score | Coverage Score | Overall Quality Score |
|---|---|---|---|---|---|---|---|
| CLUE | Backbone=Qwen2.5-14B-I... | 2025.05 | 73 | 72.1 | 76.2 | 72.2 | 67.8 |
| PromptBaseline | Backbone=Qwen2.5-14B-I... | 2025.05 | 28.1 | 29 | 25.2 | 27.5 | 32.5 |