SOTA Object Hallucination Detection benchmarks and papers with code | Wizwand
Object Hallucination Detection
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| MSCOCO | InsLen | AUROC | 89.62 | 46 | 21d ago |
| Objects365 | InsLen | AUROC | 77.44 | 40 | 21d ago |
| POPE average over three sampling strategies | InsLen | AUROC | 83.94 | 35 | 21d ago |
| CLEVR | InsLen | AUROC | 77.72 | 35 | 21d ago |
| FOIL-COCO (test) | VC-Inspector-3B | Accuracy | 99.6 | 25 | 1mo ago |
| Object Hall-Bench | LLaVA | Res Score | 63 | 22 | 5d ago |
| MME existence | R-CoV | Accuracy | 98.33 | 20 | 1mo ago |
| nocaps FOIL (Out-Domain) | EXPERT | AP | 89.1 | 17 | 12d ago |
| nocaps-FOIL (Near-Domain) | EXPERT | AP | 92.6 | 17 | 12d ago |
| nocaps FOIL In-Domain | EXPERT | AP | 88.8 | 17 | 12d ago |
| nocaps-FOIL (Overall) | EXPERT | AP | 91.1 | 17 | 12d ago |
| POPE MS-COCO Overall | BRACS | Accuracy | 86.83 | 12 | 5d ago |
| POPE averaged across MS-COCO, A-OKVQA, and GQA (Adversarial) | VAF | Accuracy | 0.807 | 12 | 3mo ago |
| POPE averaged across MS-COCO, A-OKVQA, and GQA (Popular) | VAF | Accuracy | 85.2 | 12 | 3mo ago |
| POPE averaged across MS-COCO, A-OKVQA, and GQA (Random) | VAF | Accuracy | 90.1 | 12 | 3mo ago |
| POPE | Qwen2-VL-7B | Accuracy | 89.1 | 11 | 14d ago |
| LLaVA-Bench | AvisC | CHAIRs | 59.4 | 10 | 9d ago |
| FOIL (test) | RefFLEUR | Accuracy | 98.4 | 9 | 3mo ago |
| MSCOCO Average performance across VLMs (test) | Overthinking Score | AUC | 87.33 | 8 | 2mo ago |
| MSCOCO Qwen3-VL 3 (test) | Overthinking Score | AUC | 86.89 | 8 | 2mo ago |
| MSCOCO Gemma 3 (test) | Overthinking Score | AUC | 85.59 | 8 | 2mo ago |
| MSCOCO LLaVA 1.5 (test) | Overthinking Score | AUC | 89.73 | 8 | 2mo ago |
| AMBER out-of-distribution (OOD) | Overthinking Score | AUC | 0.8611 | 8 | 2mo ago |
| POPE popular | RSP | F1 Score | 86.5 | 6 | 6d ago |
| POPE random | RSP | F1 Score | 89.1 | 6 | 6d ago |
Showing 25 of 27 rows