Object Hallucination Probing on OKVQA POPE Random
[Chart: Accuracy (Zh) over time. Top entry: CLAIM, 86.03 (Jun 3, 2025). Selectable metrics: Accuracy (Zh), Accuracy (En), Accuracy (Es), Accuracy (Ru), Accuracy (Pt), Accuracy (Bg), Accuracy (Hi), Accuracy (De), Average Accuracy.]
Updated 1mo ago
Evaluation Results
| Method | Backbone | Date | Accuracy (Zh) | Accuracy (En) | Accuracy (Es) | Accuracy (Ru) | Accuracy (Pt) | Accuracy (Bg) | Accuracy (Hi) | Accuracy (De) | Average Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CLAIM | Qwen-VL-Chat | 2025.06 | 86.03 | 88.13 | 83.47 | 68.83 | - | - | 86.13 | - | 82.52 |
| CLAIM | LLaVA-1.5 | 2025.06 | 85.97 | 86.8 | 85.63 | 83.87 | 80.63 | - | - | - | 84.58 |
| Baseline | Qwen-VL-Chat | 2025.06 | 85.1 | 88.47 | 67.7 | 74.63 | - | - | 59.53 | 75.17 | 72.43 |
| VCD | Qwen-VL-Chat | 2025.06 | 84.3 | - | 72.23 | 73.67 | - | - | 49.07 | 75.33 | 70.92 |
| VCD | LLaVA-1.5 | 2025.06 | 77.93 | - | 68.17 | 72.17 | 75.67 | 70.33 | - | - | 72.85 |
| Baseline | LLaVA-1.5 | 2025.06 | 76.13 | 91 | 63.4 | 70.3 | 75.03 | 69.23 | - | - | 70.82 |
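The Average Accuracy column can be checked against the per-language scores. The sketch below is a minimal illustration, not the leaderboard's own computation: it assumes the average is a plain mean over the languages that have a reported score ("-" is treated as missing). The function name `average_accuracy` and its `exclude` parameter are hypothetical, introduced only for this check. Under that assumption, the CLAIM (Qwen-VL-Chat) row reproduces its listed 82.52 with En included, while the Baseline (Qwen-VL-Chat) row reproduces its listed 72.43 only when En is left out.

```python
from statistics import mean


def average_accuracy(scores, exclude=()):
    """Mean over languages with a reported score, rounded to 2 decimals.

    `scores` maps a language code to a float, or to None where the
    leaderboard shows "-". `exclude` is a hypothetical knob for leaving
    out languages (assumption: some rows appear to omit En).
    """
    values = [
        value
        for lang, value in scores.items()
        if value is not None and lang not in exclude
    ]
    return round(mean(values), 2)


# Scores copied from the results listed above (None = "-").
claim_qwen = {
    "Zh": 86.03, "En": 88.13, "Es": 83.47, "Ru": 68.83,
    "Pt": None, "Bg": None, "Hi": 86.13, "De": None,
}
baseline_qwen = {
    "Zh": 85.1, "En": 88.47, "Es": 67.7, "Ru": 74.63,
    "Pt": None, "Bg": None, "Hi": 59.53, "De": 75.17,
}

print(average_accuracy(claim_qwen))                       # 82.52 (matches listed value)
print(average_accuracy(baseline_qwen))                    # 75.1  (does not match 72.43)
print(average_accuracy(baseline_qwen, exclude=("En",)))   # 72.43 (matches listed value)
```

Because the rows do not all cover the same languages, and En appears to be counted for some rows but not others, the Average Accuracy values are best compared only between rows that report the same language set.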