Transcription is a high-skill process that involves listening to a recording, researching the subject, understanding the context, and typing what is said accurately as text. Done right, it can take a lot of time.

One hour of audio or video can take 4-9 hours to transcribe, depending on the subject, number of speakers, and audio quality.

The total time varies from one recording to the next and depends on several factors.

Factors that affect transcription time

  1. The subject - Recordings that require researching spellings or terms take longer to transcribe. Technical recordings in particular usually take longer.

  2. Multiple Speakers - Accurately transcribing a recording with multiple speakers is truly hard work, so it is fair that many transcription services charge extra for it.

  3. Audio Quality - Recordings made outdoors or without an external microphone often have background noise, echoes, and low volume. Compared to clear recordings, these take longer to transcribe because it takes many rounds of listening to understand what’s being said.

  4. Transcription Style - The transcription style dictates how detailed the transcript is going to be. Three styles are generally used: verbatim, intelligent verbatim, and true verbatim.

  • Verbatim transcription is the most popular style. It includes every word said on the recording, minus distractions like fillers and false starts.

  • Intelligent verbatim transcription includes detailed editing and paraphrasing to create error-free business transcripts.

  • True verbatim is a highly detailed style of transcription that captures everything on the recording, including laughter and ambient sounds.

True verbatim takes much more time than, say, intelligent verbatim, where all these details are left out while transcribing.


  • A simple one-hour recording with clear speech, no background noise, and non-technical terminology can take up to 4 hours to transcribe.

  • A complex one-hour recording (background noise, poor recording quality, people talking over each other, etc.) can take 5 to 9 hours to transcribe.
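
To turn these per-hour figures into a rough estimate for a recording of any length, the arithmetic can be sketched in a few lines of Python. This is only an illustration. The function name, the "simple" and "complex" labels, and the assumption that effort scales linearly with recording length are ours, not a standard formula. The figures above give no lower bound for simple recordings, so none is invented here.

```python
from typing import Optional, Tuple

# Hours of transcription work per hour of audio, taken from the figures above.
# No lower bound is stated for simple recordings, so it is left as None.
HOURS_PER_AUDIO_HOUR = {
    "simple": (None, 4.0),
    "complex": (5.0, 9.0),
}


def estimate_transcription_hours(
    audio_minutes: float, kind: str
) -> Tuple[Optional[float], float]:
    """Return a (low, high) estimate in hours for one recording.

    Assumes effort scales linearly with recording length, which is a
    simplification made for this sketch.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must not be negative")
    if kind not in HOURS_PER_AUDIO_HOUR:
        raise ValueError("kind must be 'simple' or 'complex'")

    low_rate, high_rate = HOURS_PER_AUDIO_HOUR[kind]
    audio_hours = audio_minutes / 60
    low = None if low_rate is None else low_rate * audio_hours
    return low, high_rate * audio_hours


if __name__ == "__main__":
    low, high = estimate_transcription_hours(90, "complex")
    print(f"90-minute complex recording: {low} to {high} hours")
```

For example, a 90-minute complex recording works out to roughly 7.5 to 13.5 hours, while a 30-minute simple recording can take up to 2 hours.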


