
VidSitu

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Semantic Role Prediction | VidSitu (test) | CIDEr 84.85 | 17 |
| Event relation prediction | VidSitu | Mean Accuracy 35.32 | 12 |
| Verb prediction | VidSitu (test) | Accuracy@1 44.67 | 7 |
| Multimodal Event Extraction | VidSitu Aud | ET 24.2 | 3 |
| Video Tracking | VidSitu | V-Trck 23.2 | 3 |
| Event Relation | VidSitu | ER 14.5 | 3 |
| Event Typing | VidSitu | ET 22.3 | 3 |
| Video Tracking | VidSitu Txt | V-Trck 34.4 | 3 |
| Event Relation | VidSitu Txt | Event Relation (ER) 23.1 | 3 |
| Event Typing | VidSitu-Txt | ET Score 32.8 | 3 |
| Grounded Video Situation Recognition | VidSitu v1 (val) | Verb Accuracy@1 46.79 | 3 |
| Grounded Video Situation Recognition | VidSitu (test) | Verb Acc@1 46.79 | 3 |