SlowFast Networks for Video Recognition

About

We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway, operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at high frame rate, to capture motion at fine temporal resolution. The Fast pathway can be made very lightweight by reducing its channel capacity, yet can learn useful temporal information for video recognition. Our models achieve strong performance for both action classification and detection in video, and large improvements are pin-pointed as contributions by our SlowFast concept. We report state-of-the-art accuracy on major video recognition benchmarks, Kinetics, Charades and AVA. Code has been made available at: https://github.com/facebookresearch/SlowFast

Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He • 2018
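
The two-pathway idea in the abstract can be illustrated with a minimal sketch of how one clip might be sampled at two frame rates. The function names, the parameter names (slow_stride, alpha, beta_inv) and the numeric values below are assumptions chosen for illustration; they are not taken from this page or from the released code.

```python
def sample_pathway_indices(num_frames, slow_stride, alpha):
    """Return (slow, fast) frame-index lists for one clip.

    slow_stride: temporal stride of the Slow pathway (hypothetical name).
    alpha: how many times denser the Fast pathway samples frames (assumed).
    """
    if slow_stride % alpha != 0:
        raise ValueError("slow_stride must be divisible by alpha")
    fast_stride = slow_stride // alpha
    # Slow pathway: few frames, sampled sparsely in time.
    slow = list(range(0, num_frames, slow_stride))
    # Fast pathway: many frames, sampled densely in time.
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


def fast_channels(slow_channels, beta_inv):
    """Channel width of a Fast pathway that is beta_inv times narrower."""
    return max(1, slow_channels // beta_inv)


slow, fast = sample_pathway_indices(num_frames=64, slow_stride=16, alpha=8)
print(len(slow), len(fast))   # 4 32
print(fast_channels(64, 8))   # 8
```

With these illustrative settings the Fast pathway sees eight times as many frames as the Slow pathway but uses one eighth of the channels, which is one way a high-frame-rate pathway could stay lightweight.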

Related benchmarks

Task                  Dataset                          Metric           Result   Rank
Action Recognition    Something-Something v2 (val)     Top-1 Accuracy   63.9     535
Action Recognition    Kinetics-400                     Top-1 Acc        81.5     413
Action Recognition    UCF101                           Accuracy         92.8     365
Action Recognition    UCF101 (mean of 3 splits)        Accuracy         96.8     357
Action Recognition    Something-Something v2           Top-1 Accuracy   63.1     341
Action Recognition    Something-Something v2 (test)    Top-1 Acc        63.1     333
Action Recognition    UCF101 (test)                    Accuracy         95.756   307
Action Recognition    Something-something v1 (val)     Top-1 Acc        51.2     257
Action Recognition    Kinetics 400 (test)              Top-1 Accuracy   79.8     245
Video Classification  Kinetics 400 (val)               Top-1 Acc        79.8     204
Showing 10 of 192 rows