CapArena

Benchmarks

Task Name              Dataset Name     SOTA Result                     Trend
Image Captioning       CapArena-Auto    Average Win-Rate Margin: 91.7   29
Open-ended Generation  CapArena         Average Score: 11.83            7
Image Captioning       CapArena         Claude Score: 41.67             6
Image Captioning       CapArena (test)  GPT-Score: 19.15                5