Make video accessible, localised, mobile and searchable by captioning

The World Health Organization estimates 278 million people worldwide have some form of hearing impairment.

A Nielsen study suggests that there has been over a 300 percent increase in online video watching since 2003. Further, most watching is done during work hours. Workplace computers are often muted or have no speakers.

Several billions of videos are watched monthly worldwide, with many of them in different

See for more about this.

I had my first foray into captioning this week for a short video that a colleague, Daniel Davis (@ourmaninjapan), did on remote debugging with Opera Dragonfly and Opera Mobile 10 to mark the release of Opera Mobile 10 Beta (go try it, it’s free).

I’m slightly pink-faced to say I’ve not done any captioning before, having always opted to transcribe video and audio, so I had to start from scratch: sourcing the right tool, figuring out how to go about editing and setting up a process. Whilst I set out to caption a video, my purpose was also to see how easy or difficult it was, as captioning is the poor cousin of accessibility, considered to be expensive, time-consuming and only relevant to a handful of people.

Before I launch into my findings, below is the final product, captioned in English, Japanese and Russian using Overstream and hosted on YouTube. Big thank you to Daniel for the translation and original video and Vadim for the Russian. You can also watch the video on Easy YouTube.

There’s also a hidden Easter Egg in there, see if you can spot it.

Captioning benefits

Accessibility – this is the obvious benefit as you’ll be opening up your content to deaf and hard of hearing users as well as people who find it easier to read rather than listen (or do both together). If you don’t have translated captions, some non-native speakers may also find content easier to consume when reading captions.

Localisation – adding translations to your captions widens your potential audience massively. There are plenty of tools out there, such as dotSUB, that enable you to crowdsource translations, and many hosts, such as YouTube, support multiple caption tracks.

Mobile – users with mobile phones who may not have earphones or are in a noisy place also benefit. I do wonder how much can be visible on some small screens but certainly some people will find it useful.

Search – site indexing may also get a boost. For example YouTube supports video searching of caption data which also filters through into Google search.

Getting the right tool

There are more captioning tools out there than I’ve had hot dinners so I thought I’d narrow it down scientifically and just ask over Twitter what people recommended. My only stipulations were that it had to be quick, easy and free (what else!).


I gave Google’s web based tool CaptionTube a go first. It’s super easy to get started as you just use a Gmail login and from there you upload video from your YouTube collection. So far so simple.

What I didn’t find so intuitive was the captioning interface itself. When dropping text into the timeline I wasn’t able to clearly see when text started and ended as the end time was measured in how long the segment was rather than when it stopped in the overall timeline. This just didn’t work for me.

The CaptionTube interface fails to show captions overlaid on the video

In addition to that I had to flick between the Timeline and Preview screens to see the captions I’d just created overlaid on the video. With pages taking time to load, not to mention breaking the rhythm of what I was doing, this really held me back. Too much buffering for my liking.


Being a newbie to all this I wasn’t sure if I was expecting too much or missing the point but after a chat with Antonia Hyde – who knows a thing or two about accessible multimedia – I decided to switch to Overstream which had originally been recommended by AbilityNet.

This was altogether a lot better plus Overstream support a number of video providers: YouTube, Google Video, MySpace Video, Dailymotion, Veoh and It was pretty easy to upload a YouTube video but equally easy to miss a crucial instruction that you need to have the video in question playing in YouTube when you hit the upload button.

Overstream shows the edit box and video with captions overlaid on the same page.

The interface gave me much more of an integrated toolbox and by now I had an idea of what I wanted which helped. One huge bonus was being able to add text to the timeline, complete with start and end times, adjust time lengths and see in real time the text overlaid on the video on the same page.

I had a few problems trying to play the video once done in a new window with a URL warning popping up but it was easy enough to download the .srt file (with all the captions and timeline in) and upload that in turn to YouTube.
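
If you’ve never opened one, an .srt file is just plain text: a cue number, a start and end time written as hours:minutes:seconds,milliseconds, the caption text, then a blank line before the next cue. Here is a minimal Python sketch of building one by hand. The function names and the sample cues are made up for illustration; this is not how Overstream or YouTube generate the file, only a picture of what ends up inside it.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates one cue from the next
    return "\n".join(blocks)


# Made-up cues, for illustration only
cues = [
    (0.0, 2.5, "Hello and welcome."),
    (2.5, 6.0, "Today we're looking at remote debugging."),
]
print(to_srt(cues))
```

Note that each cue carries an absolute start and end time on the video’s timeline rather than a segment length, which is the way I found easiest to think about when editing.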


Next on my list to try is the downloadable tool MAGpie, from the National Center for Accessible Media. I didn’t try it this time as Overstream got the job done, plus MAGpie supposedly doesn’t play nicely with Intel-based Macs. I did have a quick look at it however and, while very clunky and old looking, it does give you the option to style captions, which looks pretty good. I’ll be looking at this in more depth when I next caption something.

Stanford Captioning Service

John Foliot pointed me to Stanford Captioning Service which looks like an excellent service. All you need to do is upload a video file which is then converted into multiple formats – FLV, MP4, MP3. These are then transcribed by Stanford contractors for a small fee. When the transcription is done Stanford do automatic timestamp generation to turn the transcript into various caption formats – this part is free.

For my short video I was happy to transcribe and caption the audio myself but if I had longer videos to get captioned I’d almost certainly use these guys. Victor Tsaran, head of accessibility at Yahoo!, used the Stanford Captioning Service to caption a video about himself recently.


Lastly I dug out my login to dotSUB, whose main selling point is enabling subtitling of videos on the web into, and from, any language. It’s also a collaborative tool so you can crowdsource community input and/or work collaboratively with your team to get the captions done. Of the tools tested this was by far the simplest and easiest to use.

Captioning tips

As soon as I got started I realised that I needed to have a process as to how I approached doing the actual work. Here are a couple of things that worked for me – let me know if you have any more worth adding to the list:

  • Transcribe text before you start captioning – you can do this yourself, pay a professional to do it or use voice recognition. Even though the last two options are less labour intensive you will need to edit and double check text – especially with voice recognition.
  • Break it down – once you have your transcript you’ll have a clear idea of the volume of words and quality. You can then break text into short sentences that fit on screen without obscuring too much of the screen real estate. All I did was use a text file and hit return after short sentences or natural breaks in a sentence. Once I started adding text to the timeline this had to be reworked as I went along but having it already drafted was a big help.
  • Editing text – if you have a text that works verbatim then great, but this is unlikely and there’s nothing wrong with removing repetitions or false starts to sentences. The key is to keep it succinct while maintaining the original meaning and flavour of the language as well as the character of the speaker.
  • Punctuation – I found that less is more. Obviously you want a full stop at the end of sentences but Andrew Kirkpatrick, head of accessibility at Adobe, recommends removing commas at the end of lines. We don’t ‘see’ punctuation when we hear people so visually breaking text down like this makes sense to me.
  • Timing – you can create a bit of drama, suspense and humour by remaining faithful to how people speak and using timing to replace tone. For example, someone getting excited may talk in short sentences so break the transcript down so that it is given in short segments rather than having longer segments.

Check out captioning tips from the WGBH Media Access Group, captioning tips and tools from NCDAE and W3C Multimedia FAQ for more.
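
The “break it down” step in the tips above can be roughed out with a few lines of Python before you start reworking things by hand. This is only a sketch: the 40 character limit is an assumption picked for illustration, not a rule from any of the tools or guides mentioned here, and the function name is hypothetical. It splits the transcript into sentences, then wraps each sentence into short lines.

```python
import re
import textwrap

MAX_CHARS = 40  # assumed line length; adjust to suit your video


def draft_caption_lines(transcript, max_chars=MAX_CHARS):
    """Split a transcript into short lines: one sentence at a time, wrapped to max_chars."""
    # Split after ., ! or ? when followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    lines = []
    for sentence in sentences:
        if sentence:
            lines.extend(textwrap.wrap(sentence, width=max_chars))
    return lines


# Made-up transcript, for illustration only
sample = "Welcome to this demo. Today we will debug a page on a phone from the desktop."
for line in draft_caption_lines(sample):
    print(line)
```

A script like this only breaks on length and sentence endings, so the output is a first draft. Natural pauses, humour and the character of the speaker still need a human pass once the text is on the timeline.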

How long did the whole process take?

Captioning the 4.27 minute video took me the best part of 10 hours BUT this included researching tools, false starts as well as a bit of reading around the subject. If it’s a long video you definitely want it to be transcribed for you but if it’s a short one like this you could estimate 1 to 2 hours to transcribe it yourself, depending on your typing speed and how audible the sound is.

After that, once you have the hang of adding text to a timeline you should be ok. I added text and allocated times as I went along but you can add all the text first and allocate times second if splitting the two tasks works better for you. This probably took me about 1.5 hours.

All in all I’d average out a 4 minute video at 3 hours – but this will no doubt get better as it becomes more familiar.
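
To put that figure another way, 3 hours for a 4 minute video works out at 45 minutes of work per minute of video. The Python sketch below does the sum and scales it to other lengths. Scaling in a straight line is an assumption on my part (in practice you get faster, and longer videos are better sent out for transcription), and the function names are hypothetical.

```python
def captioning_ratio(work_hours, video_minutes):
    """Minutes of work needed per minute of video."""
    return work_hours * 60 / video_minutes


def estimate_hours(video_minutes, ratio):
    """Estimated hours of work, assuming effort scales linearly with length."""
    return video_minutes * ratio / 60


ratio = captioning_ratio(work_hours=3, video_minutes=4)
print(ratio)                      # 45.0
print(estimate_hours(10, ratio))  # 7.5
```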

It’s a bit fiddly to start with but smooth running once you get the hang of it and seeing the end result is completely worthwhile. It’s satisfying to know that the captions will help not just deaf users but also non-native English speakers as well as people looking at video on their mobile phone.

Update 20 November 2009

Google have just announced automated captioning of YouTube video which will include automatic time stamping as well as transcripts. This should be available soon and will have a huge impact for many users as well as influence in promoting captioning overall.

26 thoughts on “Make video accessible, localised, mobile and searchable by captioning”

  1. This is awesome timing Henny, I plan to caption some vids this afternoon! Was planning on blogging the experience, which I may still do, but you’ve certainly answered some questions before I even ask them!
    I was planning on giving a try too so maybe I’ll see how that compares to the ones you’ve tried here. My vids are pretty basic so I’m hoping I can turn them around fairly quickly.

  2. Yay, glad the timing worked out James. Just had a quick peek at and it looks pretty simple (no pun intended) and easy to use. I’d love to know how it goes so definitely blog it and I’ll add a link to the above.
    Have fun – it’s really quite satisfying when you get the finished result!

  3. This is a terrific article, Henny. Your tips for captioning are spot on.

    Could I possibly add a couple of alternatives to your list? The first is an application called Subtitle Workshop, which I’d describe as a much cleaner, less clunky and more capable alternative to MAGpie.

    The second is my own company, – we’re the first subtitling company to specialise only in work for the internet. It’s a pretty new company, and the aim is to use technology to bring the cost of very high-quality DVD-style captions (created by the same captioners who work on DVDs for Hollywood studios) down to a level where anyone can afford them.

    I’m glad you’ve got to grips with captioning – millions of video viewers depend on captions, and – as organisations like the RNID and Stanford Captioning will tell you – a static transcript does not make a video truly accessible, it’s no substitute for captions at all in terms of accessibility. So the “poor cousin” status of captions is something we really need to turn around, and articles like this from high-profile folks like you are a great start. Thanks.

  4. Pingback: Captioning BSL videos « A Pretty Simple blog

  5. You can safely ignore Andrew WK’s advice to unsafely ignore the rules of English punctuation. He seems to think this is the early ’90s at WGBH, where they did indeed suppress commas at ends of Line 21 captions, ostensibly because comma looked like period. Funny, they didn’t do that with semicolons.

    The naïve captioner will make a large number of punctuation, transcription, and rendering errors without training. The last thing they need is licence to ignore the rules.

  6. Hi Henny,

    Thanks for the mention of Stanford’s Captioning Service. At this time, the service is pretty much ‘reserved’ for content producers on campus (although I’ve opened it to a few outside accessibility people to demo with), as managing the outsourcing of the transcription part (which is a billable service) is something that is handled internally. That said, if anyone out there is interested in our work-flow solution, I’d be happy to answer questions and provide a walk-through; it’s not magic, just efficient, and seeks to simplify the process for non-technical content producers. I can be contacted at jfoliot at

  7. Thanks for the suggestions people. I couldn’t test everything obviously so it’s good to have pointers to other write ups and suggestions.

    Joe – do you have any further information about punctuation in captioning that I can have a look at and link to? I found it quite a challenge to work out what worked best and would like to look into it further.

    John – thanks for the clarification. I’ll certainly point people in your direction if they have questions.

  8. Great review! I’m an intern working in Assistive Technology, and we are looking to put captions on our videos. I will definitely be checking out those programs. Thanks for sharing!

  9. Iheni,
    Speaking of search engines, I have been using lately. It’s actually trying to index the captioned/subtitled videos on the Internet. There is quite a bit more than captions that I can’t explain too well. I’ll just refer to the about us page. You can also post your own captioned videos and they will include them. Hopefully, they can get BBC iPlayer, which is also worth mentioning.

  10. Great stuff, thank you Deborah, Bill and Gites and glad the article was of use Sarah.
    22frames looks pretty interesting – it’s so key to get this stuff indexed by search engines otherwise it just gets lost. Kind of defeats the point really…

  11. Pingback: Simple steps towards building an accessible site (part 2)

  12. Pingback: The Blind Buzz on Accessibility « The Blind Buzz

  13. Pingback: The Blind Buzz on Accessibility « AccessTech News

  14. Pingback: The Blind Buzz on Accessibility « The BAT Channel

