| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image Captioning Evaluation | MMHE (test) | Fluency Score: 51.3 | 15 |
| Hallucination Evaluation | MMHE | REG: 66.6 | 11 |
| Image Captioning | MMHE User Study | Human Preference Count: 19 | 2 |
| Visual Document Understanding | MMHE User Study | Human Preference Count: 21 | 2 |
| Visual Question Answering | MMHE User Study | Human Preference Count: 12 | 2 |
| Referring Expression Generation | MMHE User Study | Human Preference Count: 19 | 2 |