| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| POPE | MIRROR | Accuracy 94.42 | 2,019 | 23h ago | |
| MS-COCO (POPE Adversarial) | SIRA | Accuracy 87.83 | 190 | 5d ago | |
| POPE Adversarial | LogicCheckGPT | Accuracy 89.33 | 159 | 2d ago | |
| MS-COCO POPE (Popular) | AVISC | Accuracy 90.76 | 158 | 2d ago | |
| CHAIR | | CHAIRi Score 57 | 154 | 2d ago | |
| POPE Random | R-CoV | Accuracy 94 | 152 | 2d ago | |
| MS-COCO POPE Random | AVISC | Accuracy 92.36 | 121 | 2d ago | |
| POPE (test) | | Accuracy 90.6 | 107 | 1d ago | |
| POPE (popular) | R-CoV | Accuracy 92 | 96 | 7d ago | |
| POPE Adversarial offline | | F1 Score 68.96 | 84 | 3mo ago | |
| POPE Popular offline | OPERA | F1 Score 84.43 | 84 | 3mo ago | |
| POPE Random offline | | F1 Score 73.6 | 84 | 3mo ago | |
| MSCOCO 2014 (val) | VDD-None | CHAIRs 56.8 | 81 | 5d ago | |
| A-OKVQA POPE (Popular) | SECOND | Accuracy 90.3 | 76 | 2d ago | |
| POPE A-OKVQA | VLI | Accuracy 89.23 | 75 | 3mo ago | |
| MSCOCO POPE | DAC | Random Accuracy 91.63 | 71 | 2d ago | |
| POPE GQA Popular | SECOND | Accuracy 89.4 | 70 | 2d ago | |
| A-OKVQA POPE (Random) | SIRA | Accuracy 92.1 | 60 | 2d ago | |
| POPE MSCOCO | MHSA | F1 Score 93.97 | 60 | 19d ago | |
| POPE Random, Popular, Adversarial v1.0 | | Random Score 94.27 | 51 | 1mo ago | |
| CHAIR MSCOCO v1.0 (val) | | CHAIRs 54.6 | 51 | 1mo ago | |
| MSCOCO | MHSA | Accuracy 93.87 | 43 | 19d ago | |
| CHAIR MSCOCO | Greedy | CS Score 62 | 42 | 8d ago | |
| POPE (average across random and popular) | R-CoV | Accuracy (POPE) 91.56 | 38 | 1mo ago | |
| MSCOCO | EAH | CHAIR Scene Score 56.4 | 35 | 2mo ago | |