
MMHE

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Image Captioning Evaluation | MMHE (test) | Fluency Score: 51.3 | 15
Hallucination Evaluation | MMHE | REG: 66.6 | 11
Image Captioning | MMHE User Study | Human Preference Count: 19 | 2
Visual Document Understanding | MMHE User Study | Human Preference Count: 21 | 2
Visual Question Answering | MMHE User Study | Human Preference Count: 12 | 2
Referring Expression Generation | MMHE User Study | Human Preference Count: 19 | 2