This is a technical overview of how closed captions are generated and made available with your video. There are two sections to this article. The first is about how the captions are generated and the second details how the captions make it into your video.
If you’ve ever been in a crowded restaurant, you’ve no doubt seen text displayed on TV. Those are called captions, and they are either created by a highly trained person called a captioner or generated by software.
Broadcast Quality Captions
Broadcast quality captions refers to having a human, called a captioner, generate captions as your event is happening. Often the captioner has been trained as a court reporter or stenographer and uses a stenotype machine with a phonetic keyboard and special software, which allows them to type hundreds of words per minute with almost perfect accuracy.
Typically, captions are generated remotely, meaning that the captioner will not be on site but will connect to your video signal remotely via your Caption Encoder (more on that below).
Automated captioning is an emerging technology, so it goes by a lot of names: AI, automated, automatic, etc. The process is simply using software instead of a human to generate captions. The advantage of using software is that it tends to be less expensive; however, the cost savings comes with decreased accuracy, especially in multi-speaker environments like public meetings.
What Is The Difference?
Accuracy and quality. Automated captions are simply not as accurate as Broadcast Quality captions. In addition, human captioners may capture disfluencies like “ummm” or “aah”, or even non-speech sounds like [audience laughter].
How Accurate Do You Need To Be?
This is the difficult choice. That determination must be made by your jurisdiction. Broadcast Quality captions will provide your viewers with the highest quality captions available, but automated captions will be lower cost.
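When weighing that choice, it can help to put a number on accuracy. This article does not say how accuracy should be measured; one commonly used metric for transcripts is word error rate (WER), the number of word substitutions, insertions, and deletions divided by the number of words actually spoken. The Python sketch below assumes that metric, and the sample sentences are invented purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` measured against `reference`.

    `reference` is what was actually said; `hypothesis` is the caption text.
    Both are compared case-insensitively, word by word.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j words of the caption text.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the captions
                curr[j - 1] + 1,     # extra word in the captions
                prev[j - 1] + cost,  # matching or substituted word
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word and one dropped word out of eight.
spoken = "the council meeting will now come to order"
captioned = "the counsel meeting will come to order"
print(word_error_rate(spoken, captioned))  # 0.25
```

A lower number means more accurate captions. Running the same check on a sample of your own meetings is one way to compare caption sources before your jurisdiction decides what level of accuracy is acceptable.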
Regardless of the method that you use to have your captions generated, you will typically use one of the video pipelines listed below. Which one you choose will depend on whether you’re simulcasting your Live Stream on television.
Hardware Caption Encoder
If you have a TV station where your video signal is being broadcast, then you will need a hardware encoder to inject the captions into your video pipeline.
A hardware encoder is typically a one-time purchase, which CHAMP can provide, or you can choose to purchase one from a 3rd party. Once installed, your captioner or captioning service will need to connect to the hardware from outside your network in order to inject captions into your video signal.
If you are providing captions in one medium, such as your web stream, then you’ll need to provide them in all mediums.
Software Caption Encoder
If you are only streaming via your web site, then you can choose to use a software encoder instead. Software encoders are less expensive, with no upfront costs, and would be added to your Annual Renewal.
As you can see in the image above, the captions are injected into your video signal after it leaves your network and before it reaches the CHAMP data center. Your caption provider would connect to the cloud-based encoder and not access your local network.
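The choice between the two pipelines described above can be summarized as a simple decision rule. The Python sketch below is only illustrative: the class, field, and function names are hypothetical and do not correspond to any CHAMP product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionPipeline:
    encoder: str                # "hardware" or "software"
    captioner_connects_to: str  # where the caption provider connects
    cost_model: str             # how the encoder is paid for


def choose_pipeline(simulcast_on_tv: bool) -> CaptionPipeline:
    """Pick a caption pipeline based on whether the stream is also on TV."""
    if simulcast_on_tv:
        # A broadcast TV signal needs captions injected by a hardware encoder.
        return CaptionPipeline(
            encoder="hardware",
            captioner_connects_to="hardware encoder, reached from outside your network",
            cost_model="one-time purchase",
        )
    # Web-only streaming can use a software encoder instead (this is optional).
    return CaptionPipeline(
        encoder="software",
        captioner_connects_to="cloud-based encoder, no access to your local network",
        cost_model="added to annual renewal",
    )


print(choose_pipeline(simulcast_on_tv=True).encoder)   # hardware
print(choose_pipeline(simulcast_on_tv=False).encoder)  # software
```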