Imagine yourself walking through a shopping mall. Everywhere around you are posters and advertisements for shops and products. However, instead of standing still, the posters move, dance, and turn dramatically to face you with their products. They also seem to talk directly to you, calling your name, asking if you want a new car or need to quench your thirst with a Guinness. This is the world of the 2002 film Minority Report, a science fiction film starring Tom Cruise as a detective investigating a murder he himself is predicted to commit.
This sci-fi future might become reality, as a Chinese e-commerce company is researching new ways to create simple yet accurate facial deepfakes. About a month ago, researchers at JD Technologies released a new AI facial animation model named Structure Aware Face Animation (SAFA). SAFA is an evolution of an older model, the First Order Motion Model (FOMM), which was released in late 2019. Both SAFA and FOMM take a single image as input and animate it using a driving video as a reference: the motion of the face in the video is transferred onto the face in the still image.
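To give an idea of how little is needed, here is a minimal sketch of that single-image-plus-driving-video workflow, loosely based on the usage example published with the open-source FOMM code. The file names, config path, and checkpoint name below are placeholders, and the exact function signatures may differ between versions of the repository.

```python
import imageio
from skimage import img_as_ubyte
from skimage.transform import resize

# load_checkpoints and make_animation come from the demo script shipped
# with the first-order-model repository; all paths below are placeholders.
from demo import load_checkpoints, make_animation

# A single still photo of the face you want to animate.
source_image = resize(imageio.imread('source_face.png'), (256, 256))[..., :3]

# A short clip of someone talking or singing that "drives" the motion.
driving_video = [resize(frame, (256, 256))[..., :3]
                 for frame in imageio.mimread('driving_clip.mp4', memtest=False)]

# Load the pretrained generator and keypoint detector.
generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml',
                                          checkpoint_path='vox-cpk.pth.tar')

# Transfer the motion from the driving video onto the source image.
frames = make_animation(source_image, driving_video, generator, kp_detector,
                        relative=True)

imageio.mimsave('result.mp4', [img_as_ubyte(f) for f in frames])
```

That is the whole pipeline: one photo, one reference clip, and a pretrained model produce a talking, moving version of the person in the photo.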
FOMM quickly became popular and is being used in several apps and products. A variant of the model powers Wombo, a mobile application that can make anyone sing from a single uploaded picture. It is also used on myheritage.com to animate old photos of, for example, your ancestors. Deepfakes are already being used commercially for all kinds of purposes.
Now JD Technologies is doing its own research into facial animation, probably with e-commerce applications in mind. Besides JD Technologies, several other companies are doing their best to create fast and easy deepfakes. Examples include Synthesia, which builds an online platform for video generation, and Smarterpix, a German stock photo company that recently made AI-generated stock humans available for licensing. But why are so many companies investing in this technology?
There is definitely a market for these kinds of fake images, or "neural images" as Synthesia calls them. One reason to use them for, say, a new commercial is that all the human elements of the production are handled by the AI. The voice actors, the camera operators, the lighting crew, and even the actors are all replaced by this new deepfake technology, which is much cheaper than hiring and paying a large crew of creators. Deepfake technology also allows for quick iterations and revisions. Not happy with the results? Tweak the program, hit enter again, and out pops a new version of your video. If all of this is already available today, then what will be available tomorrow?
It is only a matter of time before some companies have deepfake technology running day and night to create all kinds of advertisements. This would allow for new, unique advertisements every day: all with the same message, but with a different person or celebrity saying it slightly differently each time. When the technology becomes even faster, it will be possible to create advertisements on the spot, the moment you open a new page.
Just imagine you're browsing YouTube on your phone and one of those annoying ads shows up, but this one calls your name and asks if you're thirsty for a Guinness. Something like this:
In a couple of years this type of advertisement might be common when you're browsing the web: an online world where videos try to get your attention by calling your name or even using faces you know. What about ads in which your mother tells you what she would like to have for her birthday?
This form of highly personalised, targeted advertising may seem like a scene from a dystopian science fiction movie, but with online advertising companies developing things like facial animation and the metaverse, we are heading towards such a world.