Security researchers in China have invented a clever way of activating voice recognition systems without speaking a word. By using high frequencies inaudible to humans but which register on electronic microphones, they were able to issue commands to every major “intelligent assistant” that were silent to every listener but the target device.
The team from Zhejiang University calls their technique DolphinAttack (PDF), after the animals’ high-pitched communications. In order to understand how it works, let’s just have a quick physics lesson.
Here comes the science!
Microphones like those in most electronics use a tiny, thin membrane that vibrates in response to air pressure changes caused by sound waves. Since people generally can't hear anything above 20 kilohertz, the audio pipeline typically discards any signal above that frequency, even though the hardware still picks it up; the stage that does this is called a low-pass filter.
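To make the filtering step concrete, here is a minimal sketch of a first-order low-pass filter in Python. The sample rate, cutoff, and filter order are assumptions for illustration; real capture chains use much sharper filters, but the principle is the same: frequencies below the cutoff pass largely intact while frequencies above it are attenuated.

```python
import math

fs = 96000                      # assumed sample rate of the capture chain
fc = 20000                      # cutoff near the edge of human hearing
dt = 1.0 / fs
rc = 1.0 / (2 * math.pi * fc)
alpha = dt / (rc + dt)          # first-order IIR low-pass coefficient

def low_pass(samples):
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)    # y[n] = y[n-1] + alpha * (x[n] - y[n-1])
        out.append(y)
    return out

def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))

def tone(freq, n=4096):
    return [math.sin(2 * math.pi * freq * t / fs) for t in range(n)]

audible = low_pass(tone(5000))      # well inside the voice band
ultrasonic = low_pass(tone(30000))  # above the cutoff

# compare output level to input level, skipping the filter's settling transient
gain_lo = rms(audible[1000:]) / rms(tone(5000)[1000:])
gain_hi = rms(ultrasonic[1000:]) / rms(tone(30000)[1000:])
```

Running this shows the audible tone passing with little loss while the ultrasonic tone is noticeably attenuated, which is exactly the behavior the attack has to work around.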
Review of this article
Connected objects
Connected objects can be a big problem for cybersecurity because they lack proper standards for implementing basic security measures. Connected objects delivered to end users are often not secure enough: their default settings are too permissive, which can lead to trouble.
For example, if someone enters your home, they can order many things with your credit card just by asking Alexa.
Attack vectors
Connected objects also open new attack vectors, because the way you interact with them differs from a classic keyboard or pad: you only need to speak to them.
Voice recognition is more sensitive than keyboard typing because it is not a binary operation. A key is either pressed or not, and that's all; voice recognition handles far more states than that. It is an analog channel: the sensor does not capture simple 0 or 1 bits, it captures frequencies and transforms them into something the system can use. That creates many side effects, because the system cannot handle and understand all of the information it receives.
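The contrast between a binary input and an analog one can be sketched in a few lines. The sample rate and frequencies below are assumptions for illustration; the point is that a key press is a single bit, while a microphone delivers a sampled waveform whose meaning lies in its frequency content, which the system has to extract.

```python
import math

# A key press is a single bit: pressed or not.
key_pressed = True

# A microphone delivers raw samples; a toy single-bin DFT (a correlation
# with a reference sinusoid) shows how a system decides which
# frequencies are present in them.
fs = 16000                    # assumed sample rate
n = 1600                      # 0.1 s of audio containing a 440 Hz tone
samples = [math.sin(2 * math.pi * 440 * t / fs) for t in range(n)]

def dft_mag(x, freq):
    # magnitude of the signal's correlation with a sinusoid at `freq`
    re = sum(v * math.cos(2 * math.pi * freq * t / fs) for t, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * t / fs) for t, v in enumerate(x))
    return math.hypot(re, im) / len(x)

mag_440 = dft_mag(samples, 440)    # strong: the tone is present
mag_1000 = dft_mag(samples, 1000)  # near zero: that frequency is absent
```

Every stage of that analysis chain is a place where unexpected input can slip through, which is what the review means by side effects.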
In the attack described in the original article, the researchers exploit the fact that microphone hardware responds nonlinearly to ultrasonic sound: a voice command modulated onto an inaudible carrier is demodulated by the hardware itself back into the audible band. Current voice recognition systems apparently do not guard against this, so the attack gets through.
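A minimal simulation makes the mechanism concrete. A 1 kHz tone stands in for the voice command, and the nonlinearity coefficient is an assumption for illustration: the transmitted signal contains no audible content at all, yet after a squared term (modeling the microphone's nonlinearity) the command reappears at its original frequency.

```python
import math

fs = 192000          # high sample rate so a 25 kHz carrier is representable
n = int(fs * 0.02)   # 20 ms of signal
f_cmd = 1000         # stand-in for a voice command component (audible band)
f_carrier = 25000    # inaudible ultrasonic carrier

# Amplitude-modulate the "command" onto the carrier: the result contains
# only ultrasonic frequencies (carrier and its sidebands).
tx = [(1 + math.sin(2 * math.pi * f_cmd * t / fs))
      * math.sin(2 * math.pi * f_carrier * t / fs) for t in range(n)]

# Model microphone nonlinearity: the output contains a squared term
# (coefficient 0.5 is an assumed, illustrative value).
rx = [x + 0.5 * x * x for x in tx]

def dft_mag(x, freq):
    # energy at `freq`, via correlation with a reference sinusoid
    re = sum(v * math.cos(2 * math.pi * freq * t / fs) for t, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq * t / fs) for t, v in enumerate(x))
    return math.hypot(re, im) / len(x)

before = dft_mag(tx, f_cmd)  # ~0: nothing audible travels through the air
after = dft_mag(rx, f_cmd)   # clearly nonzero: the command is recovered
```

The squared term multiplies the carrier with itself, which shifts the sidebands down to baseband; a subsequent low-pass stage then removes the ultrasonic residue, leaving a clean audible command for the recognizer.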
Do experts consider this a side-channel attack?