Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)

About

This report describes our submission to the ActivityNet Challenge at CVPR 2019. We use a 3D convolutional neural network (CNN) based front-end and an ensemble of temporal convolution and LSTM classifiers to predict whether a visible person is speaking or not. Our results show significant improvements over the baseline on the AVA-ActiveSpeaker dataset.
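As a rough illustration of the ensembling step, the sketch below fuses per-frame speaking probabilities from two classifiers (for example, a temporal convolution model and an LSTM) and thresholds the result. The report summary above does not say how the ensemble is combined, so the weighted average, the 0.5 threshold, and the names `ensemble_scores` and `is_speaking` are assumptions for illustration only, not the author's method. The 3D CNN front-end is not modelled here; the inputs are assumed to be probabilities already produced by each classifier.

```python
from typing import List, Optional, Sequence


def ensemble_scores(
    per_model_scores: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted mean of per-frame speaking probabilities across models.

    Hypothetical helper: averaging is an assumed fusion rule.
    """
    if not per_model_scores:
        raise ValueError("need scores from at least one model")
    n_frames = len(per_model_scores[0])
    if any(len(scores) != n_frames for scores in per_model_scores):
        raise ValueError("all models must score the same number of frames")
    if weights is None:
        # Default to an unweighted average.
        weights = [1.0] * len(per_model_scores)
    if len(weights) != len(per_model_scores):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * scores[t] for w, scores in zip(weights, per_model_scores)) / total
        for t in range(n_frames)
    ]


def is_speaking(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label each frame as speaking when its fused score reaches the threshold."""
    return [p >= threshold for p in scores]


# Made-up scores for three frames of one visible face.
tcn_scores = [0.9, 0.2, 0.6]   # assumed temporal convolution outputs
lstm_scores = [0.7, 0.4, 0.3]  # assumed LSTM outputs
fused = ensemble_scores([tcn_scores, lstm_scores])
labels = is_speaking(fused)
```

Note that mAP is computed from the ranked scores rather than from thresholded labels, so the threshold here only matters if hard per-frame decisions are needed.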

Joon Son Chung • 2019

Related benchmarks

Task                      Dataset                                                Metric  Result  Rank
Active Speaker Detection  AVA-ActiveSpeaker (val)                                mAP     87.8    107
Active Speaker Detection  AVA-ActiveSpeaker v1.0 (val)                           mAP     87.8    27
Active Speaker Detection  AVA-ActiveSpeaker (test)                               mAP     87.8    22
Active Speaker Detection  AVA-ActiveSpeaker v1.0 (test)                          mAP     87.8    13
Active Speaker Detection  AVA-ActiveSpeaker ActivityNet Challenge 2019 (test)    mAP     87.8    9