Google introduces AI for automatically adding soundtracks to video
The Google DeepMind team has presented a tool for generating video soundtracks using artificial intelligence.
According to the developers, AI models for video generation are advancing by leaps and bounds, but the clips they produce are mostly silent. The V2A (“video-to-audio”) technology makes it possible to bring them to life.
The technology can generate music that matches the mood of the video, sound effects, and even character dialogue, all guided by a text description. The AI model at its core was trained on sounds, dialogue transcripts, and videos.
Models that generate music and sound are nothing new today. But according to the V2A developers, what sets their technology apart is that it understands the video sequence itself and automatically synchronizes the generated audio with it, while taking the user’s text prompt into account.
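DeepMind has not released a public API for V2A, but the workflow described above can be pictured as a simple conditioning interface: video frames plus an optional text prompt go in, a time-aligned soundtrack comes out. The sketch below is purely illustrative; the `V2AModel` class, the `generate_audio` method, and all parameter names are hypothetical and are not part of any published library.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical types for illustration only -- DeepMind has not published a V2A API.

@dataclass
class VideoClip:
    frames: list   # decoded video frames (e.g. arrays of pixels)
    fps: float     # frame rate, needed to time-align the generated audio


class V2AModel:
    """Conceptual stand-in for a video-to-audio model as described in the article."""

    def generate_audio(self, clip: VideoClip, prompt: Optional[str] = None) -> bytes:
        # 1. Encode the video frames so the model can "understand" the visual sequence.
        # 2. Optionally condition on the user's text prompt (music style, effects, dialogue).
        # 3. Decode a waveform whose events line up with the frames via clip.fps.
        raise NotImplementedError("illustrative sketch only")


# Usage sketch: pair a clip with a text prompt describing the desired soundtrack.
# model = V2AModel()
# soundtrack = model.generate_audio(clip, prompt="tense orchestral music, footsteps on gravel")
```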
DeepMind admits that the technology is not perfect: because the training dataset contained few videos with artefacts or other visual defects, V2A struggles to generate convincing audio for such footage.