Object Hallucination Probing on OKVQA POPE Random
[Chart: Accuracy (Zh) by date; highlighted point: CLAIM, 86.03 (Jun 3, 2025)]
Evaluation Results
| Method | Backbone | Date | Accuracy (Zh) | Accuracy (En) | Accuracy (Es) | Accuracy (Ru) | Accuracy (Pt) | Accuracy (Bg) | Accuracy (Hi) | Accuracy (De) | Average Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CLAIM | Qwen-VL-Chat | 2025.06 | 86.03 | 88.13 | 83.47 | 68.83 | - | - | 86.13 | - | 82.52 |
| CLAIM | LLaVA-1.5 | 2025.06 | 85.97 | 86.8 | 85.63 | 83.87 | 80.63 | - | - | - | 84.58 |
| Baseline | Qwen-VL-Chat | 2025.06 | 85.1 | 88.47 | 67.7 | 74.63 | - | - | 59.53 | 75.17 | 72.43 |
| VCD | Qwen-VL-Chat | 2025.06 | 84.3 | - | 72.23 | 73.67 | - | - | 49.07 | 75.33 | 70.92 |
| VCD | LLaVA-1.5 | 2025.06 | 77.93 | - | 68.17 | 72.17 | 75.67 | 70.33 | - | - | 72.85 |
| Baseline | LLaVA-1.5 | 2025.06 | 76.13 | 91 | 63.4 | 70.3 | 75.03 | 69.23 | - | - | 70.82 |