Object Hallucination Evaluation on POPE GQA (Adversarial)

84.27 Accuracy

SIRA

[Chart: results over time, Dec 29, 2025 to May 14, 2026]
Updated 19d ago

Evaluation Results

Method  Links

2026.05   84.27    -        85.18
2026.05   84.1     -        83.53
2026.05   83.25    -        84.3
2026.05   82       -        81.51
2026.05   80.24    -        80.64
2026.05   80.01    -        80.75
2026.05   79.6     -        81.17
2026.05   79.58    -        79.88
2026.05   78.77    -        78.56
2026.05   77.4     -        80.11
2026.05   77.1     -        79.3
2026.05   76.62    -        78.99
2026.05   76.09    -        78.78
2026.05   75.08    -        76.06
2026.05   75       -        78.71
2026.05   68.73    -        74.78
2025.12   0.8169   0.8496   0.8167
2025.12   0.8113   0.8418   0.8057
2025.12   0.8107   0.8329   0.8041
2025.12   0.8103   0.8293   0.8094
2025.12   0.8087   0.8107   0.808
2025.12   0.7903   0.8043   0.7854
2025.12   0.7521   0.6834   0.7976
2025.12   0.7407   0.6742   0.7822
2025.12   0.7325   0.6968   0.7687
2025.12   0.7245   0.6852   0.7532
2025.12   0.7117   0.6579   0.7536
2025.12   0.71     0.6575   0.7514
2025.12   0.7017   0.6476   0.7478
2025.12   0.6883   0.6226   0.7543
2025.12   0.6867   0.6216   0.7528
2025.12   0.686    0.6243   0.7484
2025.12   0.686    0.6394   0.731
2025.12   0.6823   0.6175   0.751