It is a new technology based on a paper in Computer Vision and Pattern Recognition by Ting-Chun Wang, Arun Mallya, and Ming-Yu Liu, and it can be used for storytelling, video conferencing, and virtual assistants.
It all starts with speech. The NVIDIA team said that only the voice is transmitted to your machine or over the internet. The voice drives an AI model called Audio2Face, which takes speech as input and generates lip-synced facial motion and expressions for a 3D head model in real time.
The system generates natural 3D facial motion, including emotions and lip, eye, and head movement. The motion of the 3D head model is then fed to a second AI model, Vid2Vid Cameo, which can animate a photo of a person. Because the pipeline takes audio as input, facial animation can be driven by any voice in any language. In the team's words, this is the power of creating a digital avatar out of a single photo.
Enjoy watching this six-minute demo from SIGGRAPH 2021 Real-Time Live now!
NVIDIA is an American multinational technology company based in Santa Clara, California. The company designs graphics processing units (GPUs) for the gaming industry and system-on-a-chip units (SoCs) for the automotive and mobile computing markets. Its products are also widely used across the 3D industry.