Factuality Evaluation on AggreFact-XSum (EXF)
[Chart: Balanced Accuracy over time. Best result: AlignScore, 0.799 (Mar 4, 2024).]
Evaluation Results
Method              Configuration                Date      Balanced Accuracy
AlignScore                                       2024.03   0.799
FENICEGPT_claims    Extractor=LLM-based cl...    2024.03   0.735
FENICET5_claims     Extractor=Knowledge-di...    2024.03   0.707
ChatGPT-Star        Prompting=Scale-based        2024.03   0.706
ChatGPT-ZS          Prompting=Zero-shot          2024.03   0.692
TrueTeacher-11B     Parameters=11B               2024.03   0.684
ChatGPT-DA          Prompting=Direct Asses...    2024.03   0.656
SummaC-Cv           variant=Convolutional        2024.03   0.646
ChatGPT-CoT         Prompting=Chain-of-Tho...    2024.03   0.609
QuestEval                                        2024.03   0.601
MENLI                                            2024.03   0.597
QAFactEval                                       2024.03   0.596
SummaC-ZS           variant=Zero-shot            2024.03   0.514
Random Baseline                                  2024.03   0.5
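
The page does not define the metric. As a minimal sketch, assuming the usual binary definition of balanced accuracy (the mean of recall on each class), the snippet below shows why a detector that ignores its input lands at 0.5, in line with the Random Baseline row. The function name `balanced_accuracy` and the label convention (1 = factually consistent, 0 = inconsistent) are assumptions for illustration, not taken from the benchmark.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary labels (assumed: 1 = consistent, 0 = inconsistent)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    # Correct predictions within each class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# Imbalanced toy data: four consistent summaries, two inconsistent ones.
y_true = [1, 1, 1, 1, 0, 0]
always_consistent = [1, 1, 1, 1, 1, 1]

# Plain accuracy would be 4/6 here, but balanced accuracy is 0.5.
score = balanced_accuracy(y_true, always_consistent)
```

Because each class contributes equally regardless of its size, a method cannot score well simply by predicting the majority label.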