
Factual Error Detection on CHOCOLATE 1.0 (LLM)

73.8 ROC AUC

GPT-4V

[Chart: ROC AUC by date (Dec 15, 2023)]
Updated 4d ago

Evaluation Results

Method | Links
2023.12
73.8
62.9
2023.12
61.7
2023.12
59.5
56.6