AI-based voice cloning enables new scams
Vijay Balasubramaniyan, the CEO, CTO, and co-founder of Pindrop Security, is warning of a new kind of online scam built on audio deepfakes. Cybercriminals have started cloning people's voices with AI-based software and using the clones to commit fraud. A loss of $7 million has been reported from voice cloning incidents.
At the RSA Conference, Balasubramaniyan explained how fraudsters use AI software to clone someone's voice. After obtaining audio or video recordings of the victim, they use the software to synthesize new speech in the victim's voice, putting the victim at risk.
Jonathan Bloom tweeted about the event:
Here at #RSAC, @pindrop CEO, CTO and co-founder @Vijay_Voice just showed us how a presidential candidate’s voice can be faked in seconds from speech samples scraped off YouTube — and how his company’s software can spot it. #RSAC2020 pic.twitter.com/JxmftMXyaz
— Jonathan Bloom (@BloomTV) February 25, 2020
Fraudsters have also been found combining voice cloning with business email compromise: they imitate the voice of a senior officer to trick employees into carrying out money transfer requests that the fraudsters themselves initiated. Five minutes of recorded audio is enough to produce a realistic clip. With more than five hours of audio, the software can produce a fake that is far more convincing.
Nonetheless, the audio deepfake threat is still small compared with phone call scams involving identity theft. Balasubramaniyan also demonstrated a system his company developed to generate the voices of well-known personalities. For fun, the software deepfaked the voice of US President Donald Trump. The demo also raised concerns about how deepfakes can be used to spread misinformation and fool the public.
For now, the good news is that computer scientists have started working on ways to detect deepfakes. In fact, Pindrop has created an AI-based system that can distinguish human speech from deepfake audio clips. It first analyzes how real human beings pronounce spoken words and then checks the recorded voice against those human speech patterns.
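The two-step idea described above (learn what human speech looks like, then check a recording against it) can be illustrated with a toy sketch. This is not Pindrop's system, whose internals are not described here. It assumes each recording has already been reduced to a list of numeric speech features, and the feature values, the function names, and the threshold of 3.0 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class HumanProfile:
    """Per-feature mean and standard deviation taken from real human speech."""
    means: list[float]
    stdevs: list[float]


def build_profile(human_samples: list[list[float]]) -> HumanProfile:
    """Step 1: summarize how real speakers behave on each (hypothetical) feature."""
    columns = list(zip(*human_samples))
    return HumanProfile(
        means=[mean(col) for col in columns],
        stdevs=[stdev(col) for col in columns],
    )


def deviation_score(features: list[float], profile: HumanProfile) -> float:
    """Step 2: average absolute z-score of one recording against the profile."""
    return mean(
        abs(value - mu) / max(sigma, 1e-9)  # guard against zero spread
        for value, mu, sigma in zip(features, profile.means, profile.stdevs)
    )


def looks_synthetic(features: list[float], profile: HumanProfile,
                    threshold: float = 3.0) -> bool:
    """Flag a recording whose features sit far from the human profile.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return deviation_score(features, profile) > threshold


# Made-up feature vectors standing in for measurements of real human speech.
human_samples = [[0.9, 1.1], [1.0, 1.0], [1.1, 0.9]]
profile = build_profile(human_samples)

print(looks_synthetic([1.0, 1.0], profile))  # close to the human samples
print(looks_synthetic([2.0, 0.0], profile))  # far from the human samples
```

A real detector would rely on far richer acoustic features and a trained model rather than a fixed distance threshold. The sketch only shows the overall shape of "profile first, then compare."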
All in all, the growing threat of audio deepfakes will likely push users to be more careful about uploading their voice and video clips to the internet.