3x Increase In Transcription Efficiency

> 3x Increase In Transcription Efficiency Tat Banerjee [January 06, 2020]

At VideoTranslator, we do a lot of work in what is called the Internationalization And Localization industry.

That being said, we are not transcribers, translators, or voice-over artists. We use AI to do transcription, translation, and synthetic (AI) dubbing.

When do we do this work for our clients? Generally when a client needs a managed service. This happens when a client is looking to try out tech and/or better understand the value proposition, or has their own reasons for us to provide this service.

When this happens, the first task on our side is plain old transcription. That is what we are going to look at today: how to do a simple transcription.

How Long Does Transcription Generally Take?

Google is our friend here. From the team at Opal Transcription Services, "The industry standard is four hours of transcription time for one hour of clear audio, or a 4:1 ratio – that is, one hour of transcription time for a 15-minute-long recording."

Google Search: How Long Does Transcribing Take?

This is a pretty good way of thinking about it. Generally it will take about 4x the time of the video content.

Can You Really Do Transcription At 3x Faster?

Maybe. We think so, but there are caveats. This is how we did our testing. For this demonstration, we will use our standard Your Money: Peter Switzer video.

Your Money was a short-lived channel, but Peter Switzer has a very distinctive Australian accent, so we use this clip internally as a standardised test bed for a number of different processes.

Here is how we tested our hunch.

Step 1: Create a New Item And Upload The Video

Step 1 is the same every time. Select the relevant template and upload the video.

3x Increase: Select the correct template and create a new item

Once the new item opens, upload the video. Once uploaded, it looks like the image below.

3x Increase: Upload your video using the highlighted button

Step 2: Use Action -> Transcribe to Transcribe Your Video

3x Increase: Action -> Transcribe to trigger the AI transcription

Click on Action -> Transcribe to use the AI to transcribe your content. We used Australian English here.

3x Increase: Action -> We selected Australian English here...

Depending on your file size, this can take time. The Your Money video is about 5 MB, and takes milliseconds. Basically, the bigger your video, the longer it will take.

Step 3: Clean Up The Transcription - Post Editing

3x Increase: Action -> We selected Australian English here...

This is where the majority of the work takes place. Here is what you need to do:

Scroll Down And The Video Will Pop Out (Picture-In-Picture)

This is point 3 in the image above. The text in yellow is a projection, so you can change the colour (point 1) for ease of transcription.

Edit Times And Text

The editor works in real time (point 2), so make changes as you go. Realistically, simply hit play and edit to your heart's content.

Download The SRT, Or Copy Paste The Content

Depending on what you are doing, you will either (i) add open captions and download the video, (ii) download the SRT file, or (iii) access the captions directly.

We recently added the ability to directly access the captions without the time stamps, which is option (iii) above. Click the button highlighted in the image below to use this functionality.

3x Increase: Action -> Use the toggle to get the captions separately
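
If you only have the downloaded SRT file (option ii) and want the plain captions, the same result can be approximated outside the app. The Python below is a minimal sketch of that idea, not how our toggle is implemented. The function name `srt_to_text` and the sample captions are made up for illustration, and it assumes a standard SRT layout: a cue number, a timing line, then the caption text, with cues separated by blank lines.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt: str) -> str:
    """Return only the caption text from SRT content, one line per cue.

    Hypothetical helper for illustration; assumes well-formed SRT input.
    """
    cues = []
    normalised = srt.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            if TIMING.match(line.strip()):
                # Everything after the timing line is caption text.
                text = [t.strip() for t in lines[i + 1:] if t.strip()]
                if text:
                    cues.append(" ".join(text))
                break
    return "\n".join(cues)


# Made-up sample captions, only to show the shape of the input.
sample = (
    "1\n"
    "00:00:00,000 --> 00:00:02,500\n"
    "G'day and welcome\n"
    "to the show.\n"
    "\n"
    "2\n"
    "00:00:02,500 --> 00:00:05,000\n"
    "2020 was a big year.\n"
)

print(srt_to_text(sample))
```

Locating the timing line first, rather than dropping every all-digit line, means a caption that happens to be just a number is not mistaken for a cue index.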

Did We Get To A 3x Efficiency Gain?

The above was not super scientific, but we did get to the 3x gain. Effectively, the above means that the time it takes to transcribe the video is a little bit more than the video length itself.

This is due to stopping and starting the PIP while you correct the captions. We found that, for a 15 minute video, transcribing takes us about 20 minutes in total, give or take. To be precise, it took 18 and a bit, but this depends on how good the original AI-produced transcript is, which in turn depends on the kind of content in the video.

Assuming 20 minutes though, given some video content will be faster and some will be slower, we get an improvement from 60 minutes -> 20 minutes, giving us the 3x improvement.
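
The arithmetic is simple enough to check by hand, but here it is as a short Python sketch in case you want to plug in your own numbers. The function names are ours for illustration only. The 4:1 ratio is the industry figure quoted earlier, and the 20 minutes is our rounded estimate, not a benchmark.

```python
def manual_minutes(video_minutes: float, ratio: float = 4.0) -> float:
    """Estimated fully manual transcription time at a ratio:1 rule of thumb."""
    return video_minutes * ratio


def speedup(baseline_minutes: float, assisted_minutes: float) -> float:
    """How many times faster the assisted workflow is than the baseline."""
    if assisted_minutes <= 0:
        raise ValueError("assisted_minutes must be positive")
    return baseline_minutes / assisted_minutes


video_length = 15      # minutes of video
assisted_time = 20     # minutes of AI transcription plus post-editing (our estimate)

baseline = manual_minutes(video_length)   # 15 * 4 = 60 minutes
gain = speedup(baseline, assisted_time)   # 60 / 20 = 3

print(f"Manual: {baseline:.0f} min, assisted: {assisted_time} min, gain: {gain:.1f}x")
```

Swap in your own video length and editing time to see where your content lands. Cleaner audio should push the gain up, and noisy or jargon-heavy content will pull it down.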

Conclusion

Using the original 4:1 ratio, we think the average video of 15 minutes will take 20 minutes to transcribe, as opposed to 60 minutes for a fully human transcription. This is how we got to the 3x improvement.

Curious about how we could help your business? Check out our managed service, or try our app for free! Alternatively, drop us an email at hello@videotranslator.ai.