Ever notice how the ads Facebook pushes and the videos TikTok recommends are now uncannily close to what you’ve been talking about with friends and family lately? Today, microphones are embedded in almost everything, from our phones, watches, and TVs to voice assistants, and they are always within earshot. Computers constantly run neural networks and other artificial intelligence models over your voice to extract information about you. If you want to prevent this from happening, how can you go about it?
Researchers at Columbia Engineering have developed a new system that generates whisper-quiet sounds you can play in any room, in any situation, to stop smart devices from spying on you. It is easy to implement on hardware such as computers and smartphones, enabling people to protect the privacy of their voices.
Staying ahead of the conversation
While disrupting automatic speech recognition systems has long been known to be possible in theory, doing it fast enough for real-world use has remained a major bottleneck. The problem is that the sound that disrupts what a person is saying at this particular moment is not the sound that will disrupt their speech a second later. When people talk, their voices change constantly as they move through different words at a rapid pace. These changes make it nearly impossible for a machine to keep up with a person’s speech.
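A back-of-the-envelope sketch makes the timing problem concrete. The latency numbers below are illustrative assumptions, not measurements from the study: to mask a phoneme at time t, the jamming sound must already be playing at time t, so it has to be computed from audio heard strictly before t.

```python
# Why a reactive jammer fails: by the time a sound is captured, processed,
# and played back, the speech it targets has already been uttered.
# All figures below are assumed, order-of-magnitude values.
INFERENCE_MS = 50    # assumed time for a model to compute one output frame
PLAYBACK_MS = 60     # assumed audio output (speaker/driver) latency
PHONEME_MS = 80      # typical duration of one phoneme in fluent speech

horizon_ms = INFERENCE_MS + PLAYBACK_MS
# The system must therefore predict at least `horizon_ms` into the future,
# which here is longer than an entire phoneme.
print(horizon_ms, horizon_ms > PHONEME_MS)
```

Under these assumptions the required lookahead exceeds the length of a phoneme, which is why the sound must be predicted ahead of time rather than computed in reaction to what was just heard.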
“Our algorithm is able to keep up by predicting the characteristics of what a person will say next, giving it enough time to produce the correct whisper,” said Mia Chiquier, first author of the study and a doctoral student in the Vondrick lab. “So far, our approach works for most English words, and we plan to apply the algorithm to more languages and eventually make the whisper completely imperceptible.”
Launching a “predictive attack”
Researchers needed to design an algorithm that can disrupt neural networks in real time, can be generated continuously as speech is uttered, and applies to most words in the language. While earlier work has met at least one of these three requirements, none has met all three, so no prior method could reliably block the correct transcription of any word. In addition, when the attack sounds are broadcast over the air, they need to be loud enough to disrupt any rogue “listening” microphones that may be far away; the attack sounds must carry the same distance as the voice itself.
The researchers’ approach achieves real-time performance by predicting an attack on the future of the signal, that is, on the word the speaker is about to say, conditioned on two seconds of input speech. The team optimized the attack so that its volume resembles normal background noise, allowing people in the room to converse naturally without being successfully monitored by the automatic speech recognition system. They demonstrated that the method works in real rooms with natural ambient noise and complex scene geometry.
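The data flow described above can be sketched in a few lines. This is a minimal illustration, not the study’s model: the function `predict_perturbation` is a hypothetical stand-in for the learned predictor, and here it simply emits low-level noise scaled to the context’s energy so the attack stays near background-noise volume.

```python
import numpy as np

SR = 16_000        # sample rate in Hz (assumed)
CONTEXT_S = 2.0    # seconds of past speech the predictor conditions on
CHUNK = 1_024      # samples of perturbation emitted per step

rng = np.random.default_rng(0)

def predict_perturbation(context: np.ndarray, chunk: int) -> np.ndarray:
    """Hypothetical stand-in for the learned predictive model.

    The real system conditions on the past two seconds of speech and
    outputs a waveform crafted to derail the recognizer's *future*
    transcription; here we just emit quiet noise scaled by the
    context's energy, to show the data flow."""
    energy = float(np.sqrt(np.mean(context ** 2))) if context.size else 0.0
    return 0.1 * energy * rng.standard_normal(chunk)

def camouflage_stream(speech: np.ndarray) -> np.ndarray:
    """Walk through the speech chunk by chunk, producing each chunk's
    perturbation from only the audio heard *before* that chunk."""
    ctx_len = int(CONTEXT_S * SR)
    out = np.zeros_like(speech)
    for start in range(0, len(speech) - CHUNK, CHUNK):
        context = speech[max(0, start - ctx_len):start]
        # Perturbation for the upcoming chunk, computed ahead of time.
        out[start:start + CHUNK] = predict_perturbation(context, CHUNK)
    return out

speech = rng.standard_normal(SR * 4)   # 4 seconds of stand-in audio
perturb = camouflage_stream(speech)
# The attack stays much quieter than the speech it masks.
print(perturb.std() / speech.std())
```

The key design point mirrored here is causality: each chunk of the perturbation is computed from strictly earlier audio, so in a live system it could be playing by the time the targeted speech occurs.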
The ethical bottom line of artificial intelligence
“For many of us in the research community, the ethics of artificial intelligence technology is a fundamental issue, but it seems to belong to a separate thought process. It’s like being so happy that we finally built a self-driving car that we forgot to design the steering wheel and brakes,” said Jianbo Shi, a professor of computer and information science at the University of Pennsylvania and a leading researcher in machine learning. “As a community, we need to think ‘consciously’ about the human and societal impacts of the AI technologies we develop, from the earliest stages of research design. Mia Chiquier and Carl Vondrick’s research asks the question: How can AI be used to protect us from unintended uses of AI? Their work leads many of us to think in the following direction: instead of asking what ethical AI can do for us, ask what ethical boundaries we can hold AI to. Once we follow this direction, the study of AI ethics becomes just as interesting and creative.”