diagnostic

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Image-to-Text Retrieval | diagnostic (test-unseen) | Accuracy@50: 85.21 | 9 |
| Image-to-Text Retrieval | diagnostic seen (test) | Acc@50: 84.97 | 9 |
| Text-to-Image Retrieval | diagnostic (test-unseen) | Acc@50: 80.23 | 9 |
| Text-to-Image Retrieval | diagnostic (test-seen) | Accuracy@50: 82.02 | 9 |
| Consistency Evaluation | Diagnostic (Avg. YouCook2, COIN, CrossTask) (test) | State Accuracy: 76.92 | 8 |
| Consistent Video Retrieval | Diagnostic Average of YouCook2, COIN, CrossTask | State Accuracy: 53.81 | 5 |