| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| POPE Popular | ORCA | F1 Score 93.01 | 372 | 2d ago | |
| POPE Adversarial | LogicCheckGPT | Accuracy 90 | 353 | 2d ago | |
| POPE (Random) | | F1 Score 93.02 | 324 | 2d ago | |
| POPE | Qwen3-VL-235B | Accuracy 90.51 | 51 | 13d ago | |
| MSCOCO 500 images 2014 (val) | | Consistency Score (CS) 60.6 | 50 | 3mo ago | |
| POPE Adversarial v1.0 | Active-Look | Accuracy 89.26 | 45 | 1mo ago | |
| COCO captions 2014 (val) | | CHAIR (scene) 12.3 | 35 | 3mo ago | |
| MSCOCO POPE (test) | HGAI | Accuracy (Random) 90.7 | 32 | 2mo ago | |
| A-OKVQA POPE (test) | OSGA | Accuracy (Random) 90.13 | 29 | 2mo ago | |
| COCO 512-token budget (test) | VCD | Consistency Score 62.9 | 24 | 3mo ago | |
| COCO 64-token budget (test) | VCD | CS 0.328 | 24 | 3mo ago | |
| POPE Popular v1.0 | ONLY + VDC | Accuracy 88.03 | 24 | 3mo ago | |
| POPE v1.0 (Random) | ONLY + VDC | Accuracy 90.07 | 24 | 3mo ago | |
| MME | ICT | E Score 195 | 22 | 3mo ago | |
| POPE average across COCO, A-OKVQA, GQA | AFTER | ACC 85.7 | 22 | 3mo ago | |
| MSCOCO (test) | Nullu | Accuracy 79.52 | 21 | 3mo ago | |
| POPE | RC-DPO | Accuracy (Random) 90.73 | 12 | 6d ago | |
| POPE (test) | InfiMM-HD | POPE Score 87.9 | 12 | 3mo ago | |
| POPE (test) | | POPE Score 86.3 | 3 | 3mo ago | |
| Hallucination in Captioning | | CHi 3.2 | 3 | 3mo ago | |