SlowFast Networks for Video Recognition

About

We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway, operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating at high frame rate, to capture motion at fine temporal resolution. The Fast pathway can be made very lightweight by reducing its channel capacity, yet can learn useful temporal information for video recognition. Our models achieve strong performance for both action classification and detection in video, and large improvements are pin-pointed as contributions by our SlowFast concept. We report state-of-the-art accuracy on major video recognition benchmarks, Kinetics, Charades and AVA. Code has been made available at: https://github.com/facebookresearch/SlowFast
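The two-pathway idea in the abstract can be illustrated with a small frame-sampling sketch. This is not the authors' implementation (see the linked repository for that). It only shows how one clip could be sampled at two frame rates and how the Fast pathway's channel width could be scaled down. The function names and the default values (Slow stride of 16 frames, a frame-rate ratio of 8, a channel ratio of 1/8) are assumptions chosen for illustration and are not stated on this page.

```python
# Minimal sketch of two-rate frame sampling, under assumed hyperparameters.
# slow_stride, alpha and beta_inv are illustrative values, not taken from this page.

def sample_pathway_indices(num_frames, slow_stride=16, alpha=8):
    """Return (slow_indices, fast_indices) for a clip of num_frames frames.

    The Slow pathway keeps one frame every `slow_stride` frames.
    The Fast pathway samples `alpha` times more densely.
    """
    if slow_stride % alpha != 0:
        raise ValueError("slow_stride must be divisible by alpha")
    fast_stride = slow_stride // alpha
    slow = list(range(0, num_frames, slow_stride))
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


def fast_channels(slow_channels, beta_inv=8):
    """Channel width of the lightweight Fast pathway (Slow width / beta_inv)."""
    return max(1, slow_channels // beta_inv)


slow_idx, fast_idx = sample_pathway_indices(64)
print(len(slow_idx), len(fast_idx))  # frames seen by each pathway
print(fast_channels(64))             # Fast pathway width for a 64-channel Slow layer
```

Under these assumed settings the Fast pathway sees eight times as many frames as the Slow pathway but carries one eighth of the channels, which is how it can stay lightweight while still observing fine-grained motion.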

Christoph Feichtenhofer, Haoqi Fan, Jitendra Malik, Kaiming He • 2018

Related benchmarks

Task	Dataset	Result
Action Recognition	Something-Something v2 (val)	Top-1 Accuracy 63.9	545
Action Recognition	Kinetics-400	Top-1 Acc 81.5	498
Action Recognition	UCF101	Accuracy 92.8	433
Action Recognition	Something-Something v2	Top-1 Accuracy 63.1	363
Action Recognition	UCF101 (mean of 3 splits)	Accuracy 96.8	357
Action Recognition	UCF101 (test)	Accuracy 95.756	357
Action Recognition	Something-Something v2 (test)	Top-1 Acc 63.1	333
Action Recognition	Something-Something v1 (val)	Top-1 Acc 51.2	257
Action Recognition	Kinetics 400 (test)	Top-1 Accuracy 79.8	245
Video Classification	Kinetics 400 (val)	Top-1 Acc 79.8	204

Showing 10 of 236 rows
