Abstract
Talking human face generation aims to synthesize a natural human face that speaks in correspondence with a given text or audio sequence. The application of recently developed Deep Learning (DL) methods, such as Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), and Neural Radiance Fields (NeRF), to data generation and talking human face generation has attracted significant research interest from academia and industry. These methods have recently been used to address several problems in image processing and computer vision. Despite notable advances, applying them to real-world problems such as talking human face generation remains challenging. Synthetic talking faces (deepfakes) generated by these methods could enable many applications, including augmented reality, virtual reality, computer games, teleconferencing, virtual try-on, special movie effects, and avatars. This paper reviews and discusses DL-based methods, including CNNs, GANs, and NeRF, and their application to talking human face generation. We analyze existing approaches with respect to their application to talking face generation, investigate the related general problems, and highlight open research issues. We also provide quantitative and qualitative evaluations of existing approaches in the field.
Original language | English |
---|---|
Article number | 119678 |
Journal | Expert Systems with Applications |
Volume | 219 |
DOIs | |
State | Published - 1 Jun 2023 |
Bibliographical note
Publisher Copyright: © 2023
Keywords
- 3D face generation
- Autoencoder
- Datasets
- Deep generative model
- Evaluation metrics
- Mel spectrogram
- Neural networks
- Neural radiance field
- Talking human face animation
- Unsupervised learning