Deepfake AI – Dangers of (Deep) Faking It Till You Make It


Article by: Amanda Nechesa

Publication date:

On 23rd April 2024, Larry Madowo shared a video of himself on social media that he insisted was not real. The fake video had been circulating, prompting Madowo to set the record straight. But how could it not be real? The video clearly shows his face, and the voice on its audio is undoubtedly his. 

Image by Vilius Kukanauskas from Pixabay

When I first came across the video, in which the CNN International Correspondent advertises an online game, I knew it was fake solely because Larry Madowo had superimposed the word ‘FAKE’ on it. This surprised me and stoked my interest. I wondered: had I encountered this video on social media without seeing Larry Madowo’s reaction to it, would I have known it was fake? 

Madowo also posed that question to his audience: in the deepfake AI future, how will we tell what’s real from what’s doctored? The question sounds ridiculous if you think about it. We, of course, want to believe we know what’s real. When you walk and feel the ground under your feet, you don’t question whether the ground is real. You know it is. The people you meet daily – your mama mboga (vegetable vendor), the stylist or barber who does your hair, the shopkeeper who sells you milk – you don’t question these interactions because you know you really are experiencing them. 

We know they are real because they exist in the physical realm. But with the advent of social media, technology has blurred our perceptions. We spend much of our lives glued to a screen, and even though we cannot touch the things we see, we want to believe, just as in the physical world, that we are not being duped. 

Enter deepfake AI. Also known as synthetic media, deepfake AI is a type of artificial intelligence that uses deep learning to generate convincing yet fake media, including video, audio and images. With deepfake AI, there is only one goal – to produce an output so authentic that the human eye cannot tell it is fake. 

Alarming as that might sound, companies responsible for creating deepfake AI claim to do so with the general good in mind. Recently, OpenAI launched Voice Engine, an audio deepfake tool that uses text input and a 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker. 

Some of the intended uses for Voice Engine, as listed in OpenAI’s press release, include providing reading assistance, reaching global communities, helping patients recover their voices, and supporting people who are nonverbal. From the demonstrated samples of how the tool can be useful, it’s clear that the intention behind it is pure.

Dangers of (deep) faking it till you make it 

However, in the same press statement, OpenAI also recognises the potential risks that Voice Engine poses, the main one being impersonation. To curb this risk, OpenAI has put measures in place: its testing partners must follow usage policies that prohibit impersonating individuals or organisations without consent, generated output is watermarked so its origin can be traced, and users are required to explicitly disclose that the output is AI-generated and not real.

These measures are important, but remember that OpenAI is only one AI company in a sea of many. Even though it has put policies in place to prevent the misuse of deepfake AI, that doesn’t mean the rest of the AI companies will follow suit. 

By some estimates, there are about 67,200 AI companies globally, and many of them are experimenting with deepfake technology. This means that there are, as we speak, developers creating and distributing deepfake tools that let users create fake media and share it as authentic, without necessarily caring about usage policies. 

We can already see the implications of this in Larry Madowo’s AI-generated video advertising an online game. Now, imagine that Madowo were a politician, and the generated audio contained inflammatory political views. Unless the video was explicitly labelled as AI-generated, the many people viewing it might have taken the views in it as real, which could have sparked political conflict. And if the video were shared during an election year, who knows how it could affect the outcome of an election? 

This is not just an imagined scenario. The use of deepfake AI in politics is an ongoing threat. In Slovakia, which held its parliamentary elections in September 2023, a deepfake audio of the progressive party leader plotting to rig the elections went viral on social media. And recently in the US, which is currently in an election year, a deepfake audio of President Joe Biden urging citizens not to vote in the New Hampshire primary also spread on the internet. 

But politics aside, consider another scenario. Assume that Madowo is your relative, and someone creates a deepfake video of him in a rather compromising position, requesting financial assistance. Without knowing the video is fake, you would most likely be inclined to offer that assistance, and just like that, you would have been conned. 

Catching deepfake AI 

From impersonation and fraud to threats to privacy and security, the spread of misinformation and identity theft, the risks of deepfake AI are many. And perhaps the most alarming part is that what we are witnessing now is just the tip of the iceberg. Artificial intelligence is still evolving, and while the media deepfake AI creates today can already fool our eyes, in a few years it will be almost indistinguishable from what we consider authentic. 

To repeat Larry Madowo’s question: when that time comes, how will we know what to trust? How can we tell that what we are watching or listening to is not real? 

You will be both happy and sad to learn that yes, it is possible to distinguish deepfake media from the real thing. Happy because it sounds like there is light at the end of the AI tunnel, and sad because, get this, the only sure way to do it – to identify the subtle differences between AI-generated output and real input – is by using AI itself. 

Yes, you heard me right. The only way to catch AI is with AI. As the old saying goes: set a thief to catch a thief. 

Deep learning, the form of machine learning behind deepfake AI, typically works by pitting two models against each other. The first model, the generator, takes in the raw input and tries to mirror it as closely as possible, producing an output that strongly resembles it.

After the output is generated, the second model – the discriminator, whose purpose is to test whether the output can be identified as fake – comes in. If it cannot, the system is deemed successful. The two models work side by side, each improving with every round, to create fake media that looks as authentic as possible. 
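The push-and-pull between those two models can be sketched in miniature. Below is a toy, hypothetical example in Python (using only NumPy): the "generator" is reduced to a single learnable mean that tries to imitate real data drawn from a distribution centred at 5.0, and the "discriminator" is a simple logistic classifier that tries to tell real samples from generated ones. Real deepfake systems use deep neural networks on images and audio, but the adversarial loop is the same idea.

```python
import numpy as np

rng = np.random.default_rng(0)

REAL_MEAN = 5.0  # the "authentic" data: samples from N(5, 1)

# Generator: a single learnable parameter mu; fakes are drawn from N(mu, 1)
mu = 0.0
# Discriminator: logistic model p(real | x) = sigmoid(w*x + b)
w, b = 0.0, 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.05
for step in range(2000):
    x_real = rng.normal(REAL_MEAN, 1.0, 64)
    x_fake = rng.normal(mu, 1.0, 64)

    # Discriminator update: push p(real) -> 1 on real data, -> 0 on fakes
    p_real = sigmoid(w * x_real + b)
    p_fake = sigmoid(w * x_fake + b)
    grad_w = np.mean((1 - p_real) * x_real) - np.mean(p_fake * x_fake)
    grad_b = np.mean(1 - p_real) - np.mean(p_fake)
    w += lr * grad_w
    b += lr * grad_b

    # Generator update: nudge mu so fresh fakes look "real" to the
    # discriminator (non-saturating loss; x = mu + noise, so dx/dmu = 1)
    x_fake = rng.normal(mu, 1.0, 64)
    p_fake = sigmoid(w * x_fake + b)
    grad_mu = np.mean((1 - p_fake) * w)
    mu += lr * grad_mu

# mu has drifted toward the real mean of 5.0
print(f"generator mean after training: {mu:.2f}")
```

Each round, the discriminator gets slightly better at flagging fakes, which in turn gives the generator a signal for making its fakes harder to flag – exactly the side-by-side improvement described above.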

If, say, someone wants to prove that a piece of generated media is not real, the only way to do so reliably is with an AI trained to spot the subtle differences between raw input and AI-generated output. The human eye or ear, beyond a certain point, cannot. 

I will once again refer to Larry Madowo’s video as an example. Once Madowo shared the video on social media, I scrolled through the comments and noticed two camps among the people who watched it. One group was, like Madowo, clearly concerned about the risks of deepfake AI. 

However, there was also another group who laughed it off, saying they knew from the very beginning that the video was fake because of the way it got some parts of Madowo’s face wrong – the shape of his eyebrows, the colour of his eyes, or his accent. But at the end of the video, Madowo also shares the real footage used to generate the fake one, and to the human eye there is not one detail the AI got wrong, contrary to this group’s belief.

If this is all starting to sound a little hopeless for the value we place on human integrity, that’s because it is. After all, we are in the era of (deep) faking it till we make it.