DID-Bench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image Captioning | DID-Bench GT-{GPT4-V} | BLEU-1: 40.3 | 19 |
| Image Captioning | DID-Bench GT-{LLaVA} | BLEU-1: 42.84 | 19 |
| Image Captioning | DID-Bench GT-LLaVA (test) | BLEU-1: 39.93 | 15 |
| Image Captioning | DID-Bench GT-GPT4-V 1.0 (test) | BLEU-1: 36.83 | 15 |
| Multimodal Evaluation | DID-Bench | CLIP-S Score: 41.19 | 12 |
| Image Captioning | DID-Bench | CIDEr: 3.31 | 4 |
| Image Captioning | DID-Bench (val) | CIDEr: - | 0 |
| Image Captioning | DID-Bench GT-LLaVA 1.0 (test) | BLEU-1: - | 0 |