LLM-AGGREFACT

Benchmarks

Task Name                              Dataset Name            SOTA Result       Trend
Factuality Checking                    LLM-AggreFact (test)    CNN Score: 72.5   16
Faithfulness Hallucination Detection   LLM-AggreFact Refined   Agg-CNN: 86.8     14
Factuality Evaluation                  LLM-AggreFact (test)    CNN Score: 69.9   13
Fact-Checking                          LLM-AGGREFACT (test)    Cost ($): 0.2     10