
Object Hallucination Evaluation on POPE Adversarial

85.89 Accuracy

SpecEyes

[Chart: results over time, Nov 25, 2025 to Mar 24, 2026]
Updated 24d ago

Evaluation Results

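| Date | Accuracy | | | | | |
|---|---|---|---|---|---|---|
| 2026.03 | 85.89 | - | - | - | - | 1.78 |
| 2026.03 | 85.89 | - | - | - | - | 1.81 |
| 2026.03 | 85.87 | - | - | - | - | 1.77 |
| 2026.03 | 85.76 | - | - | - | - | 1.68 |
| 2026.03 | 85.13 | - | - | - | - | 2.06 |
| 2026.03 | 85.13 | - | - | - | - | 2.08 |
| 2026.03 | 85.13 | - | - | - | - | 2.13 |
| 2025.11 | 84.86 | - | - | - | 85.09 | - |
| 2025.11 | 84.82 | - | - | - | 84.57 | - |
| 2025.11 | 84.8 | - | - | - | 84.74 | - |
| 2025.11 | 84.72 | - | - | - | 84.47 | - |
| 2026.03 | 84.62 | - | - | - | - | 0.46 |
| 2025.11 | 84.6 | - | - | - | 84.82 | - |
| 2026.03 | 83.97 | - | - | - | - | 1.89 |
| 2025.11 | 83.14 | - | - | - | 83 | - |
| 2025.11 | 83.06 | - | - | - | 83.22 | - |
| 2025.11 | 82.64 | - | - | - | 81.82 | - |
| 2026.03 | 82.56 | - | - | - | - | 4.2 |
| 2025.11 | 82.31 | - | - | - | 81.4 | - |
| 2026.03 | 81.32 | - | - | - | - | 1 |
| 2025.11 | 81.04 | - | - | - | 79.73 | - |
| 2026.03 | 78.43 | - | - | - | - | 1 |
| 2025.11 | 77.02 | - | - | - | 79.88 | - |
| 2025.11 | 77.01 | - | - | - | 80.28 | - |
| 2025.11 | 73.44 | - | - | - | 78.26 | - |
| 2025.11 | 72.61 | - | - | - | 77.38 | - |
| 2025.11 | 72.38 | - | - | - | 77.64 | - |
| 2025.11 | 71.95 | - | - | - | 77.03 | - |
| 2025.11 | 71.92 | - | - | - | 77.03 | - |
| 2025.11 | 71.85 | - | - | - | 77.04 | - |
| 2025.11 | 71.58 | - | - | - | 77.23 | - |
| 2025.11 | 70.83 | - | - | - | 76.46 | - |
| 2025.11 | 70.08 | - | - | - | 76.41 | - |
| 2025.11 | 69.59 | - | - | - | 75.94 | - |
| 2026.03 | 49.1 | - | - | - | - | 0.38 |
| 2026.02 | 0.874 | 12.8 | 114.1 | 4.7 | - | - |
| 2026.02 | 0.871 | 12.6 | 107.4 | 5.6 | - | - |
| 2026.02 | 0.87 | 12.6 | 110.8 | 4.9 | - | - |
| 2026.02 | 0.865 | 13.2 | 112.2 | 4.3 | - | - |
| 2026.02 | 0.861 | 13.7 | 105.2 | 4.5 | - | - |
| 2026.02 | 0.861 | 13.6 | 104.4 | 4.6 | - | - |
| 2026.02 | 0.861 | 13.6 | 111.9 | 3.9 | - | - |
| 2026.02 | 0.858 | 13.9 | 105.1 | 4.3 | - | - |
| 2026.02 | 0.858 | 13.8 | 110.6 | 3.7 | - | - |
| 2026.02 | 0.856 | 14.1 | 103.8 | 4.1 | - | - |
| 2026.02 | 0.854 | 14.3 | 112.4 | 3.2 | - | - |
| 2026.02 | 0.848 | 14.9 | 106.2 | 3.3 | - | - |
| 2026.02 | 0.848 | 14.9 | 112.8 | 2.6 | - | - |
| 2026.02 | 0.848 | 14.9 | 112.5 | 2.6 | - | - |
| 2026.02 | 0.843 | 15.4 | 105.3 | 2.8 | - | - |
| 2026.02 | 0.842 | 15.4 | 106.8 | 2.8 | - | - |
| 2026.02 | 0.841 | 15.6 | 114.2 | 1.9 | - | - |
| 2026.02 | 0.836 | 16 | 107.6 | 2.2 | - | - |
| 2026.02 | 0.822 | 17.5 | 114.8 | - | - | - |
| 2026.02 | 0.815 | 18.2 | 108.2 | - | - | - |
| 2026.03 | - | - | - | - | 75.85 | - |
| 2026.03 | - | - | - | - | 76.04 | - |
| 2026.03 | - | - | - | - | 75.89 | - |
| 2026.03 | - | - | - | - | 76.17 | - |
| 2026.03 | - | - | - | - | 75.42 | - |
| 2026.03 | - | - | - | - | 76.22 | - |
| 2026.03 | - | - | - | - | 77.32 | - |
| 2026.03 | - | - | - | - | 76.47 | - |
| 2026.03 | - | - | - | - | 75.76 | - |
| 2026.03 | - | - | - | - | 75.48 | - |
| 2026.03 | - | - | - | - | 77.04 | - |
| 2026.03 | - | - | - | - | 76.8 | - |
| 2026.03 | - | - | - | - | 78.36 | - |
| 2026.03 | - | - | - | - | 79.91 | - |
| 2026.03 | - | - | - | - | 78.93 | - |
| 2026.03 | - | - | - | - | 79.51 | - |
| 2026.03 | - | - | - | - | 79.14 | - |
| 2026.03 | - | - | - | - | 79.32 | - |
| 2026.03 | - | - | - | - | 78.7 | - |
| 2026.03 | - | - | - | - | 79.83 | - |
| 2026.03 | - | - | - | - | 79.46 | - |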