Factual Error Detection on CHOCOLATE 1.0 (LLM)
[Chart: ROC AUC over time. Best result: GPT-4V, 73.8 (Dec 15, 2023)]
Evaluation Results
Method            Date      ROC AUC
GPT-4V            2023.12   73.8
DePlot + GPT-4    2023.12   62.9
Bard              2023.12   61.7
CHARTVE           2023.12   59.5
LLaVA-1.5-13B     2023.12   56.6