SilentCipher: Deep Audio Watermarking
About
In audio watermarking, it is challenging to encode imperceptible messages while also increasing message capacity and robustness. Although recent deep learning-based methods improve message capacity and robustness over traditional methods, the encoded messages introduce audible artefacts that restrict their use in professional settings. In this study, we introduce three key innovations. Firstly, our work is the first deep learning-based model to integrate psychoacoustic-model-based thresholding to achieve imperceptible watermarks. Secondly, we introduce pseudo-differentiable compression layers, enhancing the robustness of our watermarking algorithm. Lastly, we introduce a method that eliminates the need for perceptual losses, enabling us to achieve state-of-the-art (SOTA) results in both robustness and imperceptibility. Our contributions lead us to SilentCipher, a model that enables users to encode messages within audio signals sampled at 44.1 kHz.
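As a rough illustration of the thresholding idea mentioned above, the sketch below caps each watermark frequency bin at a fixed margin below the carrier's magnitude in the same bin. This is a deliberately simplified stand-in, not the psychoacoustic model used by SilentCipher: the function name `clamp_watermark`, the `margin_db` parameter, and the fixed per-bin margin are all assumptions made for this example, whereas a real psychoacoustic model derives a frequency-dependent masking threshold.

```python
def clamp_watermark(carrier_mag, watermark_mag, margin_db=20.0):
    """Cap each watermark bin so it sits at least margin_db below the carrier bin.

    Hypothetical helper: a fixed dB margin replaces a real masking threshold.
    """
    # Convert the dB margin to a linear amplitude ratio (20 dB -> 0.1).
    ratio = 10.0 ** (-margin_db / 20.0)
    clamped = []
    for c, w in zip(carrier_mag, watermark_mag):
        ceiling = c * ratio  # largest watermark magnitude allowed in this bin
        clamped.append(min(w, ceiling))
    return clamped


# Toy magnitude spectra with three frequency bins each (made-up values).
carrier = [1.0, 0.5, 0.01]
watermark = [0.05, 0.2, 0.005]
out = clamp_watermark(carrier, watermark, margin_db=20.0)
print(out)
```

Bins where the watermark is already below the ceiling pass through unchanged; louder bins are reduced to the ceiling, so the watermark energy follows the carrier's spectral shape.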
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Audio Watermarking | GuitarSet | Survivability Detection Rate | 100 | 23 |
| Audio Watermarking | LibriSpeech | Detection Accuracy | 99.19 | 23 |
| Audio Watermarking | jaCappella | Survivability Rate | 46 | 23 |
| Watermark Robustness | Freischuetz | Survivability | 97.26 | 16 |
| Watermark Robustness | AIR | Survivability | 18.6 | 16 |
| Audio Watermarking | Audio Robustness Benchmark averaged across 14 attacks | PESQ | 4.15 | 11 |
| Audio Watermarking Robustness | LibriSpeech and Common Voice (test) | No Attack Robustness | 100 | 10 |
| Audio Watermarking | Clotho | Detectability Accuracy | 96.7 | 7 |
| Audio Watermarking | PCD | Detection Accuracy | 96.7 | 7 |
| Audio Watermarking | MAESTRO | Detectability Accuracy | 96.7 | 7 |