
Fact-checking on AggreFact Xsum

76.4 Balanced Accuracy

GPT-4o

[Chart: Balanced Accuracy over time; y-axis ticks 52.272, 58.536, 64.8, 71.064; x-axis label Feb 23, 2025]
Updated 1mo ago

Evaluation Results

Date       Balanced Accuracy
2025.02    76.4
2025.02    75.7
2025.02    75.4
2025.02    74.8
2025.02    74.6
2025.02    72.9
2025.02    72.1
2025.02    70.8
2025.02    68
2025.02    67.6
-          60.1
-          55.6
2025.02    55.5
2025.02    53.4
2025.02    53.2