| Dataset Name | SOTA Method | Metric | | | Trend |
|---|---|---|---|---|---|
| POPE | MIRROR | Accuracy 94.42 | 1,455 | 2d ago | |
| MS-COCO (POPE Adversarial) | AVISC | Accuracy 87.62 | 138 | 8d ago | |
| CHAIR | | CS Score 59.4 | 108 | 4d ago | |
| MS-COCO POPE (Popular) | AVISC | Accuracy 90.76 | 108 | 8d ago | |
| POPE Adversarial offline | | F1 Score 68.96 | 84 | 1mo ago | |
| POPE Popular offline | OPERA | F1 Score 84.43 | 84 | 1mo ago | |
| POPE Random offline | | F1 Score 73.6 | 84 | 1mo ago | |
| POPE (test) | | Accuracy 90.6 | 79 | 3d ago | |
| POPE A-OKVQA | VLI | Accuracy 89.23 | 75 | 1mo ago | |
| MS-COCO POPE Random | AVISC | Accuracy 92.36 | 71 | 8d ago | |
| POPE Adversarial | SpecEyes | Accuracy 85.89 | 55 | 24d ago | |
| POPE MSCOCO | VLI | Accuracy 92.58 | 55 | 1mo ago | |
| MSCOCO 2014 (val) | | CHAIRs 54.6 | 55 | 1mo ago | |
| A-OKVQA POPE (Popular) | SECOND | Accuracy 90.3 | 52 | 18d ago | |
| POPE Random, Popular, Adversarial v1.0 | | Random Score 94.27 | 51 | 11d ago | |
| CHAIR MSCOCO v1.0 (val) | | CHAIRs 54.6 | 51 | 11d ago | |
| MSCOCO POPE | DAC | Random Accuracy 91.63 | 47 | 3d ago | |
| POPE GQA Popular | SECOND | Accuracy 89.4 | 46 | 18d ago | |
| MSCOCO | VACoDe | Accuracy 88.97 | 41 | 1mo ago | |
| A-OKVQA POPE (Random) | HDD | Accuracy 89.5 | 36 | 1mo ago | |
| MSCOCO | EAH | CHAIR Scene Score 56.4 | 35 | 18d ago | |
| POPE GQA (test) | LLaVA-1.5 + HIRE | Average Accuracy 84.72 | 29 | 17d ago | |
| GQA (Random) | MESA | Accuracy 89.5 | 28 | 8d ago | |
| MSCOCO (Random) | HDD | Accuracy 91.5 | 28 | 8d ago | |
| MSCOCO CHAIR | LLaVA-1.5-7B + Vissink | CHAIR_S 52.4 | 27 | 1mo ago | |