Text-To-Speech: AI Dubbing And What You Can Do With It
by Tat Banerjee| Oct 15, 2019

Today we are going to look at AI dubbing and its various use cases. You may be wondering whether your content is suitable for AI dubbing or AI localisation. In simple terms, this guide covers:

  • What is dubbing or dub localisation?
  • Should I use a human, or a machine for my dubbing?
  • How do I use a video translator to dub my video?
  • What rules of thumb, or heuristics, help achieve a superior outcome with AI dubbing?

On reviewing this visual guide, you will be able to (1) upload your own video content, (2) transcribe your content, (3) dub your content into a different dialect.

Note: We are going to take a video in Australian English, and switch it into American English. The usual use case is to translate into some other language first, and then do the dubbing. We forgo the translation step here so this guide is easier to follow.

Basics

From Wikipedia, ‘Dubbing, mixing, or re-recording is a post-production process used in film making and video production in which additional or supplementary recordings are “mixed” with original production sound to create the finished soundtrack.’

In everyday usage, however, the term 'dubbing' commonly refers to the replacement of the actors' voices with those of different performers speaking another language, which is called 're-voicing', or a 'voice-over', in the film industry.

Outcomes - Before And After

To make it obvious what we are talking about, the "after" version is provided.

Steps - Transcribe Your Content Using The Provided AI

  1. We are working with an English video today, so select the English template, and create a new item. Note that the account we are using to show this functionality has Chinese, French and Vietnamese as translation options. We will not be using this functionality today.
    AI Dubbing: Select the English Template and create a new Item
  2. This is the original content. Please watch the video below.
  3. We know the speaker is Australian, so we select English (Australia) as the preferred dialect to transcribe with. Note that picking the correct dialect makes the AI transcription much more accurate.
    AI Dubbing: Use Action -> Transcribe and then select Australian English
  4. Once transcription is complete, use the Picture-In-Picture (PIP) to clean up the transcript. Note that the text shown in the PIP updates in real time, so you can edit both the timestamps and the text of the transcript and see the changes directly.
  5. As an example, you can see we removed text fragment 9 and merged its text into text fragment 8. This makes no difference to the rendering of the captions, but it leaves a gap in the fragment numbering. Use the Auto-Index button to renumber the fragments if you require your *.srt to work properly.
    AI Dubbing: Use the Picture-In-Picture to clean up the transcript
  6. Save your changes and your transcription is complete. If you need the captions burnt into the asset, use the Auto-Overlay. Nicely done!
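
If you have already downloaded your *.srt file and want to fix the numbering yourself, the renumbering from step 5 can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Auto-Index button is implemented; the function name `reindex_srt` and the sample captions are made up for this example.

```python
import re


def reindex_srt(srt_text: str) -> str:
    """Renumber SRT cues sequentially from 1, keeping timestamps and text."""
    # Cues in an SRT file are separated by one or more blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", srt_text.strip()) if b.strip()]
    renumbered = []
    for new_index, block in enumerate(blocks, start=1):
        lines = block.splitlines()
        # Drop the old cue number (if present) before adding the new one.
        if lines[0].strip().isdigit():
            lines = lines[1:]
        renumbered.append("\n".join([str(new_index)] + lines))
    return "\n\n".join(renumbered) + "\n"


# Hypothetical captions: cue 9 was merged into cue 8, leaving a gap.
sample = """7
00:00:10,000 --> 00:00:12,000
First caption.

8
00:00:12,000 --> 00:00:16,500
Merged caption text.

10
00:00:16,500 --> 00:00:18,000
Next caption.
"""

print(reindex_srt(sample))
```

The timestamps and caption text are left untouched; only the cue numbers change.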

Steps - Dubbing/Adding A Voice Over With AI

  1. To dub the content using a different voice, our first step is to copy the captions from our video component into our audio component. Click into your editor, and hit Ctrl-A and Ctrl-C.
    AI Dubbing: Select the captions and copy using Ctrl-A and Ctrl-C
  2. Go to the Audio component, click add captions and paste your content. Use the highlighted button to add captions without adding an audio file.
    AI Dubbing: Go to an Audio component, and paste your captions there using Ctrl-V
  3. We will transcribe again, this time using the Text-To-Speech Transcription AI. In the image below, we use the Text-To-Speech tab. It is also possible to break up your captions into fragments, as shown below. We will only use the first fragment, but you can break up your content to produce different *.mp3 files.
    AI Dubbing: Use the Text-To-Speech AI to speak out your content
  4. A few quick notes. The options shown are the different voices, with (M) for Male and (F) for Female. The accent is also shown; we are using English (US) (M), which is American English Male. The S/P indicates Standard or Professional. We recommend Professional, as Standard is simply an older technology that sounds more robotic. Once done, it looks like the image below.
    AI Dubbing: Audio component after text-to-speech transcription
  5. The audio content is below. Have a listen! Sweet - dubbing is complete.
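
If you want to prepare your fragments before pasting them in, the splitting mentioned in step 3 can be sketched as follows. This is a minimal Python sketch that assumes you are working with plain caption text and want each fragment to end on a sentence boundary. The function name `split_into_fragments` and the `max_chars` limit are hypothetical and are not settings of the platform.

```python
import re


def split_into_fragments(text: str, max_chars: int = 200) -> list[str]:
    """Group sentences into fragments of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own fragment.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    fragments = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # The next sentence would overflow, so close the current fragment.
            fragments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        fragments.append(current)
    return fragments


for number, fragment in enumerate(split_into_fragments("One. Two. Three.", max_chars=9), start=1):
    print(number, fragment)
```

Each fragment could then be converted to speech separately, giving one *.mp3 file per fragment.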

Steps - Overlaying Our Audio On Our Video

  1. We are now going to add our generated audio to our original video. Go back to your video component, and click on Audio Overlays. Click the + and your audio fragment will be shown.
    AI Dubbing: Add the AI speech using Audio Overlay
  2. Notice that we have not set any times yet. Enter the start and end times in seconds. We entered 0.5 and 21 as the start and end times, and then clicked the + button to bake in the audio overlay. That is the entire process. The final output is shown below.

Conclusion

In this visual guide, we used our video translator to dub our video, that is, to overlay a different voice on it. Note that most clients prefer to download the assets and do the last step, the audio overlay, themselves. But you can use our app if you think it makes sense! Send us some feedback if you do.
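
For readers who do the overlay step themselves, here is one possible approach, sketched in Python. It assumes you have the free ffmpeg tool installed and have downloaded both the original video and the generated audio. The file names and the function `build_overlay_command` are made up for this example, and this is not how our app performs the overlay. Note that this sketch replaces the original audio track with the dubbed one rather than mixing the two.

```python
import shlex


def build_overlay_command(video: str, dub: str, output: str,
                          start: float, end: float) -> list[str]:
    """Build an ffmpeg command that plays the dub between start and end (seconds)."""
    if end <= start:
        raise ValueError("end must be greater than start")
    delay_ms = round(start * 1000)   # adelay expects milliseconds
    duration = end - start           # cut the dub so it stops at the end time
    audio_filter = (
        f"[1:a]atrim=duration={duration:g},"
        f"adelay={delay_ms}|{delay_ms}[dub]"
    )
    return [
        "ffmpeg",
        "-i", video,                 # input 0: the original video
        "-i", dub,                   # input 1: the generated speech
        "-filter_complex", audio_filter,
        "-map", "0:v",               # keep the original picture
        "-map", "[dub]",             # use the delayed dub as the only audio
        "-c:v", "copy",              # do not re-encode the video stream
        output,
    ]


# Same start and end times as in the guide: 0.5 and 21 seconds.
cmd = build_overlay_command("original.mp4", "dub.mp3", "dubbed.mp4", 0.5, 21)
print(shlex.join(cmd))
```

The script only prints the command so you can inspect it first; you can then paste it into a terminal, or run it from Python with `subprocess.run(cmd, check=True)`.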

Please connect with us on LinkedIn, YouTube or Facebook for any comments, questions, or just to keep up to date with the work we do!

We are very grateful for your support!

If you are interested in trying out our technology, please try our platform or drop us an email at hello@videotranslator.ai.

© Video Translator 2024 (ABN: 73 602 663 141) - All Rights Reserved