Frequently Asked Questions

Heuristics To Simplify Your Workflow


WHAT IS TRANSCRIPTION?

Transcription is a process which converts speech (either live or recorded) into a written or electronic text document. Transcription services are often required for business, legal, or medical purposes.

The most common type of transcription converts a spoken-language source into text, typically a computer file that can be printed as a document such as a report.

WHO ARE TRANSCRIPTS FOR?

  1. Transcripts are very useful for people who are deaf, hard of hearing, or have another hearing-related disability. Adding transcripts makes your content much more accessible. Accessibility is worthwhile in its own right, and it can also benefit you in unexpected ways: people with hearing impairments will often choose your content simply because transcripts are available, as they may not have many other choices.
  2. Many language learners find transcripts very useful and would like to use your content to learn a foreign language. Reading the transcript while listening to the audio or watching the video helps these learners follow and enjoy your content.
  3. Transcripts are very good for Search Engine Optimisation (SEO). For audio, this could mean show notes or episode descriptions. For video, this could mean the title, description and other social media specific metadata. In certain social media channels, where video is played muted by default, open captions may be very useful.

Many people find transcripts very useful.

WHAT IS TRANSCRIBING AUDIO? HOW MANY KINDS OF AUDIO TRANSCRIPTIONS ARE THERE?

Transcribing audio is the process of converting the speech in an audio file into text. In the most common case, podcasting, there are three different approaches:

SHOW NOTES (ALSO KNOWN AS EPISODE DESCRIPTIONS)

It is very useful to add comprehensive show notes or episode descriptions. For good SEO we highly recommend:

  1. Write a thorough description, and check its spelling and grammar.
  2. Highlight topics covered in the podcast while being mindful of commonly used search terms and topics.
  3. Add relevant links and resources. This will help your podcast rank better on search engines.

TRANSCRIPTS

If done correctly, transcripts of your podcast can be very useful. Follow the guidelines below:

  1. Post-edit your transcript to ensure accuracy.
  2. Remove filler words such as ‘umms’ and ‘ahhs’ to make the transcript more suitable for search engines to index.
  3. Avoid repeated content which may be present in the introduction and closing portions of your podcast.
  4. If available, add a pdf or epub to the social media channel you are using to describe your podcast.
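
As a minimal sketch of the filler-word guideline above, the Python below strips a small set of fillers from a transcript. The filler list and the function name `strip_fillers` are assumptions made for illustration, not part of any product; a human post-edit is still needed afterwards.

```python
import re

# Hypothetical filler list; extend it to match your speakers' habits.
FILLERS = ("um", "umm", "uh", "ah", "ahh", "er", "erm")

# Match a filler as a whole word, plus an optional trailing comma and spaces.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(FILLERS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove filler words and tidy the spacing left behind."""
    cleaned = _FILLER_RE.sub("", text)
    cleaned = re.sub(r"\s+([,.!?])", r"\1", cleaned)  # no space before punctuation
    return re.sub(r"\s{2,}", " ", cleaned).strip()


print(strip_fillers("Um, so today we uh talk about captions."))
```

The sketch does not restore capitalisation or repair sentences that begin with a filler, so treat its output as a first pass only.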

If you are looking to make your podcast accessible to hearing impaired, or otherwise disabled people, closed captioning should be added directly to the media file. Note that transcripts often do not syndicate with the underlying media object directly.

CLOSED CAPTIONING

Closed captioning is similar to transcripts. However, it will often include additional information for accessibility requirements, such as [CLAPPING] or [LAUGHTER] to describe events in the podcast.

  • Add closed captions if you wish to make your podcast more accessible.
  • Closed captions can be inserted into media files such as mp3/m4a audio and mp4/m4v video.
  • In mp3 files, for example, closed captions are generally inserted using ID3 tags. Depending on your use case, it may also be useful to provide a pdf transcript of your podcast.

WHAT MAKES FOR AN EFFECTIVE AUDIO TRANSCRIPTION?

For more effective audio transcription, we recommend the suggestions below.

SPELLING AND GRAMMAR

It is very important you do post-editing after any machine transcription. Even if you have manually transcribed your content, it is highly recommended you check the work prior to publishing.

Bad spelling and grammar make your content harder to read, and inaccurate transcripts may not be indexed appropriately by search engines.

Another Artificial Intelligence/Machine Learning limitation is that people do not speak the same way they write. After AI transcription, it is often necessary to correct spelling (such as capitalisation of proper nouns) and grammar (most commonly, adding full stops).

SIGNIFIERS

It is important to add signifiers to your transcript, and doubly so for audio or podcast transcripts. Signifiers include the number of speakers, the names of the speakers, any characters the speakers may be playing, background noise, manner of speech, and so on.

FORMATTING

Formatting is often overlooked. There are two cases where this is quite important.

  1. In audio transcripts, it is very important to mark speaker tags, indentation and font consistently, and to keep the same pattern throughout the whole document. A nicely formatted audio transcript should be easy to read and make the experience a pleasurable one.
  2. In video transcription, use technology such as auto-overlay to choose a font, font-size and font-colour that make it easy for the viewer to follow along. This is especially important on social media, where the underlying video is likely to be played with the audio muted.
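
To illustrate what consistent speaker tags can look like in an audio transcript, here is a small Python sketch. The function name `format_transcript`, the upper-case tag style and the column width are all hypothetical choices, not a required convention.

```python
def format_transcript(turns, width=10):
    """Render (speaker, text) pairs with one consistent speaker-tag style."""
    lines = []
    for speaker, text in turns:
        # Normalise every tag the same way: trimmed, upper case, trailing colon.
        tag = f"{speaker.strip().upper()}:"
        # Pad the tag so the spoken text starts in the same column on each line.
        lines.append(f"{tag.ljust(width)} {text.strip()}")
    return "\n".join(lines)


print(format_transcript([
    ("alice", "Welcome back to the show."),
    ("Bob ", "[LAUGHTER] Thanks for having me."),
]))
```

Signifiers such as [LAUGHTER] pass through unchanged, so they can be written in the same bracketed style throughout.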

SUMMARISING, VERBATIM OR SHOW NOTES

The idea here is to add appropriate information, both for your audience and for search engines (so your audience can find your content!).

  • Summarising: This is very useful for short description fields in various social media channels.
  • Verbatim: Use this for closed captions to promote accessibility for your content.
  • Show Notes: Use this for long form social media search indexing. Ensure you do keyword searches and optimise for SEO/SEM.

HOW DO YOU TRANSCRIBE AUDIO?

To transcribe audio content, follow the process below.

  • Select your preferred language template, and create a new item.
  • Upload your video and click Actions > Transcribe.
  • Select the language and dialect of the video and click the accept button to trigger the transcription. The transcription process will run in the background.
  • Once the transcription is complete, go through and check the transcription for proper grammar e.g. capitalisation and full stops. Embed the captions into the asset, or download the srt file separately.
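
For readers who have not seen an srt file before, the sketch below builds SubRip (srt) text from a list of timed cues: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, the caption text, and a blank line between cues. The function names and sample cues are hypothetical, and this is not Video Translator's own exporter.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


print(to_srt([(0, 2.5, "Welcome to the show."), (2.5, 4, "[LAUGHTER]")]))
```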

HOW LONG DOES IT TAKE TO TRANSCRIBE AUDIO?

While there is no official metric for how long transcription takes, when transcribing manually, it is generally accepted that transcription will take 4:1, or about four (4) times as long as the underlying content being transcribed.

So, for a fifteen (15) minute audio file, it would take about one (1) hour to transcribe manually. This estimate holds for slow, clear speakers, with little muffled or garbled audio and no one speaking over anyone else.

Machine transcription is much quicker, but is affected by several factors: file size, bit rate, and Internet speed all affect transcription speed. An easy approach is to use small (video) files optimised for the social media channel you are deploying to, and not worry too much, as AI is very quick.

For machine transcription, you can use 1:4 as a rule of thumb, or about a quarter (1/4) as long as the underlying content being transcribed. So, for a one (1) hour audio file, it would take about fifteen (15) minutes to transcribe with an Artificial Intelligence (AI), assuming a relatively high-quality source file.
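
The two rules of thumb above reduce to a single multiplication. The Python sketch below restates them; the constant and function names are illustrative only, and the ratios are rough estimates rather than guarantees.

```python
MANUAL_RATIO = 4.0    # manual: about four minutes of work per minute of audio
MACHINE_RATIO = 0.25  # machine: about a quarter of the audio's running time


def estimate_minutes(audio_minutes: float, ratio: float) -> float:
    """Estimate transcription time in minutes for a given rule-of-thumb ratio."""
    return audio_minutes * ratio


print(estimate_minutes(15, MANUAL_RATIO))   # 15-minute file, transcribed manually
print(estimate_minutes(60, MACHINE_RATIO))  # 1-hour file, transcribed by machine
```

Under these assumptions the first call gives 60 minutes and the second gives 15 minutes, matching the two worked examples in the text.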

WHAT FACTORS AFFECT TRANSCRIPTION TIME FOR PEOPLE?

Several factors can affect transcription time for people. We recommend you use Video Translator to do the first pass of the transcription, and then spend your time post-editing the result for maximum effectiveness.

However, you may need to spend more time editing, depending on the following factors.

  • Audio quality
  • Single speaker or many? A worked example here.
  • Background noise
  • Coherence: Do the different speakers talk over each other? Are they emotional when they speak, and is it easy to understand? Do they speak quickly or slowly?
  • Regional dialects: Make sure you choose the correct dialect from the available options to improve the transcription accuracy.
  • Nomenclature: Names, places, specialised terminology, subject matter specific shorthand etc.

HOW MUCH DOES IT COST TO TRANSCRIBE AUDIO?

The cost varies depending on the length of your video. You pay a service fee of $10 per month, per language, which gets you a $10 credit toward the services you use. The cost to transcribe audio is:

  • Speech To Text: $0.0024 (per second)
  • Text To Speech: $0.000864 (per word)

So, for example, with your $10 credit, you get about 1 hour 9 minutes of speech to text per month. Anything beyond that will cost an additional $0.0024 per second.
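
The credit example above can be checked with a few lines of Python. This is a sketch using only the prices quoted in this section; the function names are hypothetical, and it assumes the whole $10 credit is spent on speech to text.

```python
from decimal import Decimal

SPEECH_TO_TEXT_PER_SECOND = Decimal("0.0024")
MONTHLY_CREDIT = Decimal("10")  # assumes one language and no other usage


def seconds_covered_by_credit() -> int:
    """Whole seconds of speech to text that the monthly credit pays for."""
    return int(MONTHLY_CREDIT / SPEECH_TO_TEXT_PER_SECOND)


def cost_beyond_credit(seconds: int) -> Decimal:
    """Extra charge once the credit is used up; zero if usage stays within it."""
    cost = SPEECH_TO_TEXT_PER_SECOND * seconds
    return max(Decimal("0"), cost - MONTHLY_CREDIT)


minutes, seconds = divmod(seconds_covered_by_credit(), 60)
print(minutes, seconds)
print(cost_beyond_credit(2 * 60 * 60))  # two hours of audio
```

Under these assumptions the credit covers 4166 whole seconds, or 69 minutes 26 seconds, which is where the figure of about 1 hour 9 minutes comes from.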

CAN I TRANSCRIBE AUDIO WITHOUT A VIDEO?

Yes. If you have a separate audio file, you do not need a video to use Video Translator to transcribe audio. Simply upload the audio file into an audio component, and use the AI to transcribe it in your preferred language and dialect.

HOW CAN I ADD CAPTIONS TO A VIDEO?

How to add captions to a video using Video Translator:

  • Select your preferred language template, and create a new item.
  • Upload your video and click Actions > Transcribe. Alternatively, if you already have captions, copy and paste them into the Captions tab.
  • Click the ‘Add Captions’ button to embed the captions into the video. Alternatively, for additional styling options, use the Auto-Overlay functionality.

HOW TO EMBED SUBTITLES INTO VIDEO USING OUR AUTO-OVERLAY FEATURE?

Our Auto-Overlay feature allows you to embed subtitles into a video rather than keeping them separately in an srt file. Here’s how:

  1. Once you’ve transcribed your captions, click Actions > Auto-Overlay. Then select your video asset.
  2. From here, you can customise the font, font-size, font-colour, and opacity. Next you can add a striped background behind your text, or a proportionally inserted highlight. You can choose the colour and opacity of your highlight.
  3. Click confirm. Your video asset is now ready to download with embedded subtitles and your preferred styling.
  4. You can see more detailed instructions on how to embed subtitles into video using our Auto-Overlay feature on our blog here.

HOW TO EMBED SUBTITLES INTO MP4?

We see this question a lot, as mp4 is the most common video type across social media platforms. As long as you have the transcription text or srt file, you can embed subtitles into an mp4 or a video file in any other format.

HOW CAN I TRANSLATE A VIDEO?

It is simple to translate any video with Video Translator.

TRANSCRIBE THE AUDIO

  1. Select myTemplate and create a new item.
  2. Upload your video and click Actions > Transcribe.
  3. Select the language and dialect of the video. Click Accept. The transcription process will run in the background.
  4. Once the transcription is complete, go through and check the transcription for proper grammar e.g. capitalisation and full stops. Click the + button to add it to your video.

TRANSLATE THE CAPTIONS OR SUBTITLES

  1. Once you have the captions, click Actions > Translation.
  2. Select your preferred language. Click Accept. The translation process will run in the background.
  3. After the translation is complete, a new file will appear with the captions in the new language. We recommend having someone who speaks the language copy-edit it.
  4. Now you can add captions to your video. We have detailed instructions on translating a video from English here.

CONVERT THE TRANSLATED CAPTIONS INTO SPEECH

  1. Open your translated text file, then copy and paste the captions into the audio file.
  2. Click Transcribe File and select from the available voices.
  3. Add the audio into the original video asset as an overlay. Remember to mute the original audio so that you can hear only the translated language when the video plays.
  4. You can see an example of translating audio into German here.

HOW LONG DOES IT TAKE TO TRANSLATE A VIDEO?

Two factors affect translation time.

  • The AI translation itself is very quick, usually taking only seconds.
  • It is highly recommended that you engage a human for post-editing, which takes longer.

A normal workflow is (a) use the AI to do the first pass, (b) use a human subject matter expert for post-editing.

DO YOU HAVE A FILE SIZE LIMIT?

Yes. By default there is a limit of 100 MB on the Free-Trial to avoid abuse. On subscribing, this increases to 500 MB. Larger file sizes are available on request.
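
If you prepare uploads with a script, a size check before uploading can save a failed attempt. The Python sketch below is illustrative only: the plan names and function names are hypothetical, and it assumes the limits are measured in decimal megabytes (1,000,000 bytes), which this page does not specify.

```python
import os

MB = 1_000_000  # assumption: decimal megabytes; the exact unit is not stated
LIMITS = {"free_trial": 100 * MB, "subscribed": 500 * MB}


def fits_limit(size_bytes: int, plan: str) -> bool:
    """Return True if a file of this size is within the plan's upload limit."""
    return size_bytes <= LIMITS[plan]


def file_fits_limit(path: str, plan: str) -> bool:
    """Check a file on disk against the plan's upload limit."""
    return fits_limit(os.path.getsize(path), plan)


print(fits_limit(250 * MB, "free_trial"))
print(fits_limit(250 * MB, "subscribed"))
```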

HOW MUCH DOES IT COST TO TRANSLATE A VIDEO?

The cost varies depending on how many words are in your caption file. You pay a service fee of $10 per month, per language. The cost to translate text is:

  • Text-to-text: $0.000540 (per word)

At this rate, translating 100 words would cost about $0.05 USD (100 × $0.000540 = $0.054).

HOW MANY LANGUAGES CAN VIDEO TRANSLATOR TRANSLATE? WHICH LANGUAGES ARE SUPPORTED BY VIDEO TRANSLATOR?

Video Translator supports 60+ languages and 120+ dialects.

A full overview of our language capabilities is available here.

HOW MUCH DOES VIDEO TRANSLATOR COST TO USE?

We charge a Base Fee and a Usage Fee. The Base Fee is $10 per month, per language. So, if you were to use the platform for 2 languages, it would be $20 per month, and so on.

The Usage Fee is dependent on which services you use. The prices are below.

TRANSCRIPTION PRICING

  1. Speech To Text: $0.0024 (per second)
  2. Text To Speech: $0.000864 (per word)

TRANSLATION PRICING

  1. Text To Text: $0.000540 (per word)
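
To get a feel for how the Base Fee and Usage Fee add up, the Python sketch below applies the prices listed above. The function names are hypothetical, and it reports the two fees separately because how any credit offsets usage is not modelled here.

```python
from decimal import Decimal

BASE_FEE_PER_LANGUAGE = Decimal("10")  # per month
SPEECH_TO_TEXT_PER_SECOND = Decimal("0.0024")
TEXT_TO_SPEECH_PER_WORD = Decimal("0.000864")
TEXT_TO_TEXT_PER_WORD = Decimal("0.000540")


def base_fee(languages: int) -> Decimal:
    """Monthly Base Fee for the given number of languages."""
    return BASE_FEE_PER_LANGUAGE * languages


def usage_fee(stt_seconds: int = 0, tts_words: int = 0,
              translated_words: int = 0) -> Decimal:
    """Usage Fee for the services consumed, before any credit is applied."""
    return (SPEECH_TO_TEXT_PER_SECOND * stt_seconds
            + TEXT_TO_SPEECH_PER_WORD * tts_words
            + TEXT_TO_TEXT_PER_WORD * translated_words)


# Example: 2 languages, a 10-minute video, 1,500 words translated and voiced.
print(base_fee(2))
print(usage_fee(stt_seconds=600, tts_words=1500, translated_words=1500))
```

Decimal arithmetic is used instead of floats so that the small per-unit prices do not pick up rounding noise.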

You can find more information on our fee structures here.

We also offer Managed Services and Enterprise Solutions at additional cost.

CAN I TRY VIDEO TRANSLATOR FOR FREE?

Yes. You can trial Video Translator free of charge for up to 5 minutes of transcribed and translated video content. Click here for more details. Need more time? Drop us an email.