Semantic Consistency Evaluation on TIFA

93.8 Avg Answering Accuracy

VisualPrompter

[Chart: axis ticks 49.184, 60.767, 72.35, 83.933; date Jun 29, 2025]
Updated 1mo ago

Evaluation Results

Date      Avg Answering Accuracy    (unlabeled)
2025.06   93.8                      83
2025.06   93.7                      83
2025.06   89.8                      74.6
2025.06   87.9                      78.4
2025.06   86.9                      76.2
2025.06   85.7                      78.4
2025.06   85.3                      76.2
2025.06   84                        74.6
2025.06   82.8                      83
2025.06   80.7                      83
2025.06   80.4                      78.4
2025.06   77.2                      76.2
2025.06   75.5                      74.6
2025.06   75                        78.4
2025.06   71.4                      76.2
2025.06   62.8                      74.6
2025.06   62                        53.5
2025.06   57.4                      53.5
2025.06   53.6                      53.5
2025.06   50.9                      53.5