YouTube is boasting that a whopping 1 billion videos on the service now include closed-captioning for deaf and hearing-impaired users. That certainly sounds impressive — except when you realize that many of the site’s automatically generated captions aren’t completely right.
The Google-owned video giant first launched captions back in 2006, and three years later introduced automatic speech recognition to add closed-captioning to YouTube content. Today, YouTube users watch video with auto-generated captions more than 15 million times per day.
But the system is prone to errors. For example, the trailer for Amazon Studios’ Oscar-nominated “Manchester by the Sea” (at this link) includes numerous inaccuracies in the auto-transcribed captions, sometimes to hilarious, not to mention frustrating, effect.
“My heart was broken nose is broken too,” Michelle Williams’ character says to Casey Affleck’s, according to the YouTube captions. What she actually says is, “My heart was broken. I know yours is broken, too.”
YouTube recognizes the limits of the automatic closed-captioning. “A major goal for the team has been improving the accuracy of automatic captions — something that is not easy to do for a platform of YouTube’s size and diversity of content,” YouTube product manager Liat Kaver wrote in a blog post Thursday.
Since introducing auto-CC in 2009, YouTube claims it has boosted accuracy by 50% for videos in English by improving speech recognition and machine-learning algorithms, and expanding its training data. Ideally, according to Kaver, every video would have an automatic caption track generated by YouTube’s system that would then be reviewed and edited by the creator.
YouTube creators and publishers can also submit their own closed captions. When that’s not the case, YouTube captions all videos in 10 supported languages (English, Dutch, French, German, Italian, Japanese, Korean, Portuguese, Russian and Spanish) unless there’s poor sound quality or the system otherwise can’t detect the speech. Users have the option to remove the auto-generated captions, as well as enable community contributions to let viewers write the captions and/or translate the speech instead.
Meanwhile, the number also reveals that YouTube has at least 1 billion video clips. Google has never disclosed how many videos are hosted on the global network.
Pictured above: Amazon Studios’ “Manchester by the Sea” trailer on YouTube with automatic closed-captioning.