Working at the Skolkovo Institute of Science and Technology, the scientists have simplified “realistic neural talking head models,” which normally require a huge dataset of images to look genuine.
The researchers created lifelike talking heads from just a few images of a person and, in some cases, from a single image.
“Here, we present a system with such few-shot capability,” write the scientists. “It performs lengthy meta-learning on a large dataset of videos, and after that is able to frame few- and one-shot learning of neural talking head models of previously unseen people as adversarial training problems with high capacity generators and discriminators. Crucially, the system is able to initialize the parameters of both the generator and the discriminator in a person-specific way, so that training can be based on just a few images and done quickly, despite the need to tune tens of millions of parameters. We show that such an approach is able to learn highly realistic and personalized talking head models of new people and even portrait paintings.”
The video accompanying the study shows the Mona Lisa talking animatedly. A photo of Albert Einstein is also brought to life.
“They can make video of you saying things from a single photograph,” Joe Rogan (@joerogan) tweeted on May 27, 2019, after watching the video (https://t.co/oIy70xSPrI). “I feel like the water just dramatically pulled back from the shore and we’re about to experience a tsunami of fake shit.”
Numerous people have warned that bad actors could exploit deep fake technology to frame people for things they never actually said or did.
Terrorists or rogue states could also use the technology to fabricate videos of world leaders making hoax statements, which could serve as propaganda or even be used to start wars.
However, the scientists behind the project argue that the technology will serve positive purposes.
“It will lead to a reduction in long-distance travel and short-distance commute,” writes Egor Zakharov. “It will democratize education, and improve the quality of life for people with disabilities. It will distribute jobs more fairly and uniformly around the World. It will better connect relatives and friends separated by distance.”
Zakharov says that in the future, people will be represented by “realistic semblances of themselves.” He argues that concerns over “deep fakes” are overblown because “Hollywood has been making fake videos (aka ‘special effects’) for a century” and because hoaxes are easily detected.