MediaPipe Hands: On-device Real-time Hand Tracking
About
We present a real-time on-device hand tracking pipeline that predicts a hand skeleton from a single RGB camera for AR/VR applications. The pipeline consists of two models: 1) a palm detector and 2) a hand landmark model. It is implemented via MediaPipe, a framework for building cross-platform ML solutions. The proposed model and pipeline architecture demonstrate real-time inference speed on mobile GPUs and high prediction quality. MediaPipe Hands is open sourced at https://mediapipe.dev.
Fan Zhang, Valentin Bazarevsky, Andrey Vakunov, Andrei Tkachenka, George Sung, Chuo-Ling Chang, Matthias Grundmann • 2020
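The two-model structure described in the abstract can be sketched as a small Python class. This is a minimal illustration, not the MediaPipe implementation: `TwoStageHandPipeline`, `stub_palm_detector`, and `stub_landmark_model` are hypothetical names, the stubs return fixed values instead of running neural networks, and the landmark count of 21 is an assumption that the text above does not state.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# (x_min, y_min, x_max, y_max) in normalized image coordinates (assumed convention).
Box = Tuple[float, float, float, float]
Landmarks = List[Tuple[float, float]]

NUM_LANDMARKS = 21  # assumption for illustration; not stated in the abstract


@dataclass
class TwoStageHandPipeline:
    """Hypothetical wrapper: stage 1 finds a palm, stage 2 predicts landmarks."""
    detect_palm: Callable[[object], Optional[Box]]
    predict_landmarks: Callable[[object, Box], Landmarks]

    def process(self, frame: object) -> Optional[Landmarks]:
        box = self.detect_palm(frame)
        if box is None:
            # No palm found, so the landmark model is never invoked.
            return None
        return self.predict_landmarks(frame, box)


def stub_palm_detector(frame: object) -> Optional[Box]:
    # Stand-in for a real detector: always reports the same region.
    return (0.2, 0.2, 0.6, 0.8)


def stub_landmark_model(frame: object, box: Box) -> Landmarks:
    # Stand-in for a real landmark model: places every point at the box center.
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    return [(cx, cy)] * NUM_LANDMARKS


pipeline = TwoStageHandPipeline(stub_palm_detector, stub_landmark_model)
landmarks = pipeline.process(frame=None)  # a real frame would be an RGB image
print(len(landmarks))
```

Passing the two stages in as callables keeps the control flow visible and lets either stub be replaced by a real model without changing `process`.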
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Action Recognition | NTU RGB+D 120 Cross-Subject | Accuracy | 73.4 | 183 |
| Action Recognition | NTU RGB+D 120 (Cross-View) | Accuracy | 81.2 | 47 |
| 2D hand pose estimation | H2O (test) | PCK@0.2 | 86.22 | 6 |
| Hand Pose Estimation | FreiHAND (random data split 80/10/10) | EPE | 7.45 | 5 |
| Hand Detection | HaDR real camera (test) | PDQmax Score | 8.36 | 4 |
| Hand Pose Estimation | FreiHAND final (test) | PCK@0.2 | 81.73 | 4 |
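For readers unfamiliar with the pose metrics in the table, the sketch below shows the usual definitions of EPE (mean end-point error) and PCK (percentage of correct keypoints). It is an illustration only: the function names are hypothetical, the points are toy 2D values, and each benchmark defines its own units and threshold normalization, which this listing does not specify.

```python
import math
from typing import List, Sequence


def end_point_error(pred: List[Sequence[float]], gt: List[Sequence[float]]) -> float:
    """Mean Euclidean distance between predicted and ground-truth keypoints."""
    distances = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(distances) / len(distances)


def pck(pred: List[Sequence[float]], gt: List[Sequence[float]], threshold: float) -> float:
    """Fraction of keypoints whose error is at most `threshold`.

    The threshold is assumed to be in the same units as the keypoints;
    real benchmarks often normalize it by a reference size first.
    """
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return hits / len(pred)


# Toy data: one perfect keypoint and one that is 5 units off.
predicted = [(0.0, 0.0), (3.0, 4.0)]
ground_truth = [(0.0, 0.0), (0.0, 0.0)]

epe_value = end_point_error(predicted, ground_truth)   # (0 + 5) / 2
pck_value = pck(predicted, ground_truth, threshold=1.0)  # 1 of 2 keypoints
print(epe_value, pck_value)
```

Lower EPE is better and higher PCK is better. The table reports PCK as a percentage, which corresponds to multiplying the fraction above by 100.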