How to Add Captions & Subtitles to Videos
Why captions matter, captions vs subtitles, open vs closed, the file formats (SRT/VTT), and how to time captions so they’re actually readable.
Captions aren’t an afterthought any more — they’re how most people watch video. The bulk of social video plays on mute, so the words on screen often are the message. Here’s how to add captions properly, from why they matter to getting the timing right.
Why captions matter
- Mute-by-default viewing. Most feed video autoplays silently — captions keep your message landing.
- Accessibility. Captions make your content usable by deaf and hard-of-hearing viewers, and they are a legal requirement in many contexts.
- Watch time & reach. Captioned videos tend to hold attention longer and travel further, because viewers watching on mute can still follow along.
- SEO. A closed-caption track is indexable text — search engines can read what your video is about.
Captions vs. subtitles
They’re not quite the same thing. Captions assume the viewer can’t hear the audio, so they include speaker IDs and sound cues like [applause]. Subtitles assume the viewer can hear but may not understand the language, so they translate speech only. On social, people use “captions” loosely for both.
Open vs. closed captions
- Open captions are burned into the video — always on. Best for TikTok, Reels and feed video where captions must appear on autoplay.
- Closed captions are a separate track the viewer toggles (the [CC] button). Best for YouTube and players that support them — and they’re the version search engines can index.
The file formats: SRT and VTT
A subtitle file is just plain text listing each cue’s timing and words:
- SRT (SubRip) — the universal format. Accepted by YouTube, Vimeo and virtually every editor.
- VTT (WebVTT) — the web standard for HTML5 <track> captions, with styling support.
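Both follow the same basic pattern: a start and end timestamp joined by an arrow (-->), then the caption text. The differences are small. SRT numbers every cue and writes milliseconds after a comma (00:00:02,500). VTT opens with a WEBVTT header line, treats cue identifiers as optional, and writes milliseconds after a period (00:00:02.500).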
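To make the two layouts concrete, here is a minimal Python sketch that writes the same cues in both formats. The function names (format_timestamp, to_srt, to_vtt) and the sample cues are invented for illustration, and the sketch ignores VTT styling and positioning entirely.

```python
def format_timestamp(seconds: float, sep: str) -> str:
    """Render a time in seconds as HH:MM:SS<sep>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(cues) -> str:
    """SRT: numbered cues, comma before the milliseconds."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    return "\n".join(blocks)  # blank line between cues


def to_vtt(cues) -> str:
    """VTT: WEBVTT header, no cue numbers needed, period before the milliseconds."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical sample cues: (start seconds, end seconds, caption text)
cues = [
    (0.0, 2.5, "Most feed video autoplays silently."),
    (2.5, 5.0, "[applause]"),
]

print(to_srt(cues))
print(to_vtt(cues))
```

Saving the first output as a .srt file and the second as a .vtt file gives two plain-text files that differ only in the header, the cue numbers and the millisecond separator.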
How to add captions
- Auto-generate, then fix. Use auto-captioning to get a first draft, then correct errors — auto-transcription always needs a human pass for names, punctuation and timing.
- Edit for readability. Condense to the essential words. Captions don’t have to be word-for-word; they have to be readable in the time available.
- Break lines sensibly. Keep to ~42 characters per line and a maximum of two lines, breaking at natural phrase boundaries.
- Check the reading speed. Aim for ~17 characters per second; never exceed ~21. Run any cue through the Caption Reading Speed Calculator.
- Export the right format. SRT for upload to YouTube/social, VTT for web players, or burn-in open captions for feed video.
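The line-length and reading-speed steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular calculator works: characters are counted with spaces included, and textwrap breaks at word boundaries only, so a human still needs to move breaks to natural phrase boundaries. The names wrap_caption, reading_speed and check_cue are made up for this example.

```python
import textwrap

MAX_CHARS_PER_LINE = 42  # guideline from the steps above
MAX_LINES = 2
TARGET_CPS = 17          # characters per second to aim for
MAX_CPS = 21             # characters per second never to exceed


def wrap_caption(text: str, width: int = MAX_CHARS_PER_LINE) -> list[str]:
    """Break caption text into lines of at most `width` characters."""
    return textwrap.wrap(text, width=width)


def reading_speed(text: str, duration_s: float) -> float:
    """Characters per second, counting spaces (an assumption of this sketch)."""
    normalised = " ".join(text.split())  # collapse repeated whitespace
    return len(normalised) / duration_s


def check_cue(text: str, duration_s: float) -> dict:
    """Report line count and reading speed for one cue."""
    lines = wrap_caption(text)
    cps = reading_speed(text, duration_s)
    if cps > MAX_CPS:
        verdict = "too fast"
    elif cps > TARGET_CPS:
        verdict = "above target, within limit"
    else:
        verdict = "comfortable"
    return {
        "lines": lines,
        "fits_two_lines": len(lines) <= MAX_LINES,
        "cps": round(cps, 1),
        "verdict": verdict,
    }


# Hypothetical cue: 73 characters shown for 4 seconds
report = check_cue(
    "Captions keep your message landing even when the video autoplays on mute.",
    4.0,
)
for line in report["lines"]:
    print(line)
print(report["cps"], report["verdict"])
```

A cue that fails either check is a candidate for condensing, as described in the "Edit for readability" step.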
Timing captions so they’re readable
The most common mistake is captions that flash by too fast. A caption should stay on screen long enough to read comfortably: at least about five-sixths of a second (roughly 0.83 s, or 20 frames at 24 fps) and no more than about seven seconds, without ever exceeding ~21 characters per second. If a cue reads too fast, trim the text, extend the duration, or split it in two. Our caption calculator flags exactly which cues are over the limit.
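As a rough illustration of those limits, the Python sketch below flags a cue that is too short, too long or too fast, and works out the two simplest fixes: how long the cue would need to stay up, or how many characters fit in the time it has. The function name flag_cue and the sample cues are hypothetical, and this is not a description of how the caption calculator is implemented.

```python
MIN_DURATION_S = 5 / 6   # about 0.83 s
MAX_DURATION_S = 7.0
MAX_CPS = 21             # characters per second, spaces included (assumption)


def flag_cue(start_s: float, end_s: float, text: str) -> list[str]:
    """Return a list of timing problems for one cue (empty if none)."""
    duration = end_s - start_s
    if duration <= 0:
        return ["end time is not after start time"]

    problems = []
    if duration < MIN_DURATION_S:
        problems.append("too short")
    if duration > MAX_DURATION_S:
        problems.append("too long")

    if len(text) / duration > MAX_CPS:
        needed_s = len(text) / MAX_CPS        # fix 1: extend the duration
        max_chars = int(duration * MAX_CPS)   # fix 2: trim the text
        problems.append(
            f"too fast: extend to at least {needed_s:.2f} s "
            f"or trim to {max_chars} characters"
        )
    return problems


# Hypothetical cues: (start seconds, end seconds, caption text)
cues = [
    (0.0, 3.0, "Hello there"),
    (3.0, 3.5, "Hi"),
    (3.5, 4.5, "This caption has far too many words for one second."),
]

for start, end, text in cues:
    print(f"{start:>4} -> {end:<4} {flag_cue(start, end, text) or 'ok'}")
```

When neither extending nor trimming is practical, splitting the cue in two and re-running the check on each half is the remaining option.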
Captioning is one piece of the workflow — the TelePRO app handles captions alongside prompting, recording and export, so your video goes from script to publish-ready in one place. See the app, or start by sizing your script with the Words to Time Calculator.