
MMHal

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Hallucination Evaluation | MMHal | Score: 4 | 18 |
| Pointwise Scoring | MMHal pointwise | Kendall's Tau: 0.949 | 9 |
| Hallucination Evaluation | MMHal v1.0 (test) | Score: 2.23 | 6 |