Compensating Removed Frequency Components: Thwarting Voice Spectrum Reduction Attacks

Shu Wang

Network and Distributed System Security (NDSS) Symposium 2024 · Day 2 · Audio & Voice Security

Automatic Speech Recognition (ASR) systems are now part of everyday life, powering virtual assistants, dictation software, and content moderation platforms. That widespread adoption also makes them a target for audio attacks. This talk, presented by Shu Wang at the NDSS Symposium, examines one such vulnerability: **spectrum reduction attacks**. These attacks generate adversarial audio by removing frequency components of a speech signal that are not essential to human perception. The modified audio remains intelligible to human listeners, yet ASR systems misinterpret it and produce incorrect transcriptions.
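To make the idea of removing frequency components concrete, here is a minimal sketch in Python with NumPy. It is an illustration only, not the attack evaluated in the talk: the function name `spectrum_reduce`, the `keep_ratio` parameter, and the rule of keeping only the largest-magnitude FFT bins are all assumptions chosen for simplicity. The abstract does not say how the attack decides which components are non-essential.

```python
import numpy as np


def spectrum_reduce(signal, keep_ratio=0.2):
    """Keep only the strongest frequency bins of a 1-D real signal.

    Hypothetical helper: "non-essential" is approximated here as
    "low magnitude", which is an assumption, not the talk's criterion.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)           # one-sided spectrum
    magnitudes = np.abs(spectrum)

    # Number of bins to keep (at least one).
    n_keep = max(1, int(len(spectrum) * keep_ratio))

    # Indices of the n_keep largest-magnitude bins.
    keep_idx = np.argpartition(magnitudes, -n_keep)[-n_keep:]
    mask = np.zeros(len(spectrum), dtype=bool)
    mask[keep_idx] = True

    # Zero every other bin and transform back to the time domain.
    reduced = np.where(mask, spectrum, 0)
    return np.fft.irfft(reduced, n=len(signal))
```

Applied to a mixture of a strong tone and a weak tone with a very small `keep_ratio`, this sketch returns a signal containing only the strong tone. A real attack would operate on short speech frames and would need a perceptually motivated selection rule, neither of which is modeled here.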
