
Image Captioning

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Image Captioning Robustness | Image Captioning Dataset | CLIP Score (RN-50): 82.9 | 30 |
| Image Captioning | Image Captioning Hard Criterion Claude-3.5 | ASR: 7 | 8 |
| Image Captioning | Image Captioning Hard Criterion GPT-o1 | ASR: 53 | 8 |
| Image Captioning | Image Captioning Hard Criterion GPT-4o | ASR: 74 | 8 |
| Image Captioning | Image Captioning Target: Claude-4.5, Soft Criterion | ASR: 8 | 8 |
| Image Captioning | Image Captioning Target Gemini-2.5 Soft Criterion | Accuracy Score Rate (ASR): 79 | 8 |
| Image Captioning | Image Captioning Target: GPT-5 Soft Criterion | ASR: 56 | 8 |
| Image Captioning | Image Captioning Target: GPT-4o, Soft Criterion | ASR: 81 | 8 |