Dubbing movies into other languages makes it difficult to preserve the emotional nuance and authenticity of the original dialogue. Current methods rely either on human dubbing, which is time-consuming and costly, or on automated dubbing, which sounds robotic, lacks emotional depth, and is poorly synchronized with the background music. This project introduces an audio dubbing approach that carries the background music and the speakers' expressions over into the dubbed audio.
- Seamless Integration: The proposed model integrates the original background music and speaker expressions into the dubbed audio, ensuring a cohesive viewing experience.
- Advanced Techniques: The model uses a denoiser to distinguish background music and noise, together with a state-of-the-art speech-extraction model, to extract vocals and expressions from the video file.
- Text-to-Speech Synthesis: The final step combines the background music and speaker expressions with the voice generated by a text-to-speech model, resulting in high-quality dubbed audio.
- Empirical Evidence: To validate the effectiveness of the proposed approach, a benchmark dataset of movies for Hindi-to-English dubbing is introduced, providing comprehensive empirical evidence.
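The final mixing step described above can be illustrated with a minimal sketch. It assumes the synthesized speech and the separated background track are already available as mono samples in [-1.0, 1.0] at the same sample rate. The function name `mix_tracks` and both gain parameters are hypothetical and are not taken from this repository's code; the actual model may combine the tracks differently.

```python
from typing import List, Sequence


def mix_tracks(
    speech: Sequence[float],
    background: Sequence[float],
    speech_gain: float = 1.0,
    background_gain: float = 0.5,
) -> List[float]:
    """Overlay dubbed speech on a background track, sample by sample.

    Hypothetical helper: the shorter track is zero-padded and the
    weighted sum is clipped to the valid range [-1.0, 1.0].
    """
    length = max(len(speech), len(background))
    mixed = []
    for i in range(length):
        s = speech[i] if i < len(speech) else 0.0          # pad speech with silence
        b = background[i] if i < len(background) else 0.0  # pad background with silence
        value = speech_gain * s + background_gain * b
        mixed.append(max(-1.0, min(1.0, value)))           # hard clip to avoid overflow
    return mixed


# Toy example: three speech samples over four background samples.
dubbed = mix_tracks([0.5, -0.5, 0.9], [0.2, 0.2, 0.4, 0.4])
```

Hard clipping is the simplest way to keep the sum in range; a real pipeline would more likely normalize loudness or apply a limiter so that loud passages do not distort.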
Given a Hindi movie video, our model dubs the movie into English with the same expression and background music.
- Code: Contains the implementation of the proposed audio dubbing model, including denoising algorithms, speech-extraction models, and text-to-speech synthesis.
- Data: Includes the benchmark dataset of movies for Hindi-to-English dubbing, facilitating reproducibility and further research.
- Documentation: Provides detailed instructions for running the code, training the model, and evaluating performance.
Block diagram of the proposed model. (a) The video file with subtitles is the input. (b) The audio and the subtitle text are processed and translated simultaneously, and the vocals and the background information of the audio are separated. (c) The background information and the vocals are integrated after the speech is dubbed. (d) The final output is the movie dubbed into English.
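The flow of stages (a) to (d) can be sketched as a small orchestration function. This is only an outline under stated assumptions: the `DubbingStages` container, the `dub` function, and all four stage callables are hypothetical names, and the toy stand-ins at the bottom replace the real denoiser, translation, and text-to-speech models. The sketch also runs the stages one after another, whereas the diagram processes audio and subtitles simultaneously.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Audio = List[float]  # mono samples; a stand-in for a real audio buffer


@dataclass
class DubbingStages:
    """Hypothetical container for the four pluggable stages."""
    separate: Callable[[Audio], Tuple[Audio, Audio]]  # audio -> (vocals, background)
    translate: Callable[[str], str]                   # Hindi subtitle -> English text
    synthesize: Callable[[str, Audio], Audio]         # text + reference vocals -> speech
    mix: Callable[[Audio, Audio], Audio]              # speech + background -> output


def dub(audio: Audio, subtitles: List[str], stages: DubbingStages) -> Audio:
    """Run the stages in the order shown in the block diagram."""
    vocals, background = stages.separate(audio)             # (b) split the audio
    english = [stages.translate(line) for line in subtitles]  # (b) translate subtitles
    speech = stages.synthesize(" ".join(english), vocals)   # (c) dub the speech
    return stages.mix(speech, background)                   # (c)-(d) re-integrate


# Toy stand-ins so the sketch runs end to end; none of these are real models.
toy_stages = DubbingStages(
    separate=lambda a: (a, [0.0] * len(a)),
    translate=str.upper,
    synthesize=lambda text, ref: [0.1] * len(text.split()),
    mix=lambda s, b: [x + y for x, y in zip(s, b)],
)
output = dub([0.3, 0.3], ["namaste", "duniya"], toy_stages)
```

Passing the stages in as callables keeps the orchestration independent of any particular model, so each component can be swapped or tested in isolation.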
Refer to 'Implementation.txt' for instructions.
The complete dataset will be made public once the paper is accepted.
- Contributors' names will be updated once the paper is accepted.