Filed 11 May 2026
Spec Docket / S-004-26 Filed by Editor, The Slate · Issue 03 · 2026

VTT caption files, broadcast spec, and what to deliver.

If a client asked for a VTT file for a broadcast deliverable, somebody made a wrong turn. WebVTT is the caption format for HTML5 video, HLS, and most web platforms — not for over-the-air broadcast (which needs SCC) and not for premium SVOD master delivery (which needs TTML/IMSC1.1). Here is the actual map.

WebVTT: W3C CR, Candidate Recommendation (2019)
32: characters per line, CEA-608 broadcast
160–180 wpm: BBC subtitle reading rate
4: formats Post Slate ships per job

Short answer: a WebVTT (.vtt) file is the right deliverable for web video, HLS streaming, and most video platforms (YouTube, Vimeo, Wistia, Brightcove). It is the wrong deliverable for linear broadcast (which needs an SCC file carrying CEA-608) and for Netflix-grade SVOD masters (which require TTML/IMSC1.1).

Quick Answer

Who, what, when

Who: editors, post houses, and ad-ops leads asked to deliver "a caption file" for a video.
What: the caption format depends on the playout system. VTT for HTML5, HLS, and web. SCC for FCC-regulated linear broadcast. TTML for premium streaming masters. SRT for casual web and social.
When: before committing to the deliverable, confirm the format against the platform's spec sheet.

WebVTT (Web Video Text Tracks) is a W3C Candidate Recommendation that defines a text-based caption file format for use with HTML5 video. It is well-supported by every major web video player and is the format Apple's HLS authoring spec calls out by name. It is not, however, a broadcast format — and the language gets muddled because clients and brands use "VTT" as a generic synonym for "caption file."

What is a WebVTT file, exactly?

WebVTT is a plain-text, UTF-8 caption file with a fixed structure. The file starts with the literal signature WEBVTT on its first line. Each caption (a "cue") has three parts: an optional identifier, a timing line with the format HH:MM:SS.mmm --> HH:MM:SS.mmm (the hours field may be omitted), and a payload of caption text.

MIME type: text/vtt. File extension: .vtt. Cues support positioning settings (line, position, align, size, vertical) and inline voice tags like <v Speaker Name> for speaker attribution. The full specification lives at the W3C: www.w3.org/TR/webvtt1/.
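To make the structure concrete, here is a minimal sketch of a WebVTT reader in Python. It is not a conforming parser: it assumes cues are separated by empty lines, ignores NOTE, STYLE, and REGION blocks, and keeps cue settings as an unparsed string. The sample file and the names parse_vtt, SAMPLE_VTT, and TIMING are illustrative, not part of any spec or library.

```python
import re

# One timestamp: optional hours, then MM:SS.mmm.
TIMESTAMP = r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})"
# Timing line: start --> end, optionally followed by cue settings.
TIMING = re.compile(rf"^{TIMESTAMP}[ \t]+-->[ \t]+{TIMESTAMP}(?:[ \t]+(.*))?$")

SAMPLE_VTT = """WEBVTT

intro
00:00:01.000 --> 00:00:03.500 align:center
<v Narrator>Welcome to the show.

00:00:04.000 --> 00:00:06.000
Captions are cues with timings.
"""


def _seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds."""
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_vtt(text):
    """Return a list of cue dicts; raise ValueError if the signature is missing."""
    text = text.lstrip("\ufeff")  # tolerate an optional byte-order mark
    lines = text.splitlines()
    first = lines[0] if lines else ""
    if first != "WEBVTT" and not first.startswith(("WEBVTT ", "WEBVTT\t")):
        raise ValueError("line one must start with the WEBVTT signature")

    cues = []
    for block in re.split(r"\n{2,}", "\n".join(lines[1:])):
        block_lines = block.strip("\n").splitlines()
        # The timing line is either the first line or follows an identifier.
        for index, line in enumerate(block_lines[:2]):
            match = TIMING.match(line)
            if match:
                groups = match.groups()
                cues.append({
                    "id": block_lines[0] if index == 1 else None,
                    "start": _seconds(*groups[0:4]),
                    "end": _seconds(*groups[4:8]),
                    "settings": groups[8] or "",
                    "text": "\n".join(block_lines[index + 1:]),
                })
                break
    return cues


for cue in parse_vtt(SAMPLE_VTT):
    print(cue["id"], cue["start"], cue["end"], cue["text"])
```

The first cue in the sample carries an identifier, a positioning setting, and a voice tag. The second is the bare minimum: a timing line and a payload.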

Two notes worth holding. First, WebVTT is still formally a Candidate Recommendation at W3C, not a full Recommendation — the latest dated snapshot is from 2019. Second, the format is owned by the same web platform that defines HTML5 video, which is why it is the lingua franca of HTML5 and HLS players and why it does not appear in any FCC regulation.

Will a VTT file pass FCC broadcast review?

No. The FCC's broadcast caption rules at 47 CFR § 79.1 regulate quality (accuracy, synchronicity, completeness, placement) but do not specify a file format directly. What they require is captioning that conforms to the CEA-608 (analog NTSC) or CEA-708 (digital ATSC) standard, which a VTT file does not.

In practice that means a broadcast deliverable carries captions one of two ways: embedded inside the video file as CEA-608 (Line 21) or CEA-708 data, or as a sidecar SCC (Scenarist Closed Caption) file that the broadcaster can encode at playout. WebVTT is neither. If a station's traffic department is expecting captions for an air master, hand them an SCC or an embedded master — not a VTT.

Where does the 32-character-per-line rule come from?

Not from WebVTT, not from the FCC. The 32-character-per-line ceiling is a property of CEA-608, the analog NTSC Line 21 captioning standard. CEA-608 lays captions out on a fixed grid of 32 columns by 15 rows, of which at most four rows are normally shown at once, and it delivers them over the roughly 480 bits per second available on Line 21 of the analog video signal.
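The 480-bit-per-second figure can be checked with quick arithmetic. The sketch below assumes one field of Line 21 carrying two bytes per frame at the NTSC rate of 30000/1001 frames per second, with each byte made of seven data bits plus a parity bit. The constant names are illustrative.

```python
from fractions import Fraction

NTSC_FRAME_RATE = Fraction(30000, 1001)  # about 29.97 frames per second
BYTES_PER_FRAME = 2   # one byte pair per frame on a single Line 21 field
BITS_PER_BYTE = 8     # 7 data bits plus 1 parity bit

bits_per_second = NTSC_FRAME_RATE * BYTES_PER_FRAME * BITS_PER_BYTE
bytes_per_second = NTSC_FRAME_RATE * BYTES_PER_FRAME

print(f"{float(bits_per_second):.1f} bit/s")
print(f"{float(bytes_per_second):.1f} bytes/s, control codes included")
```

The result comes out just under 480 bits per second, or about 60 bytes per second. Control codes share that channel, so the usable caption text rate is lower.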

SCC files carry raw CEA-608 byte pairs and inherit the 32-character constraint. WebVTT inherits nothing from CEA-608 — lines can technically be longer — but most broadcast and streaming clients still apply the 32-character convention to VTT and SRT files for readability and downstream conversion.
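Because WebVTT itself enforces no line length, the 32-character convention has to be checked by the pipeline. A minimal sketch, assuming the limit applies to visible characters only, so inline tags such as <v Speaker Name> are stripped before counting. The function name overlong_lines and the tag-stripping rule are assumptions for illustration.

```python
import re

TAG = re.compile(r"<[^>]*>")  # inline markup such as <v Name>, <i>, </i>
MAX_CHARS = 32                # the CEA-608 convention discussed above


def overlong_lines(cue_text, limit=MAX_CHARS):
    """Return (line_number, visible_length, text) for each line over the limit."""
    problems = []
    for number, line in enumerate(cue_text.splitlines(), start=1):
        visible = TAG.sub("", line).strip()
        if len(visible) > limit:
            problems.append((number, len(visible), visible))
    return problems


payload = (
    "<v Narrator>This line is short enough.\n"
    "This second line runs well past the thirty-two character ceiling."
)
for number, length, text in overlong_lines(payload):
    print(f"line {number}: {length} chars: {text}")
```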

Reading-rate guidance has a similar story. The frequently-cited 17 characters per second figure comes from Netflix's Timed Text Style Guide, not from the FCC. The BBC's published subtitle guidelines use roughly 160–180 words per minute for adult content. The FCC, again, specifies quality ("a speed that can be read by viewers") rather than a number.
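Both figures are simple ratios of text length to cue duration. The sketch below computes characters per second and words per minute for one cue. The counting convention is an assumption: spaces and punctuation count as characters, line breaks do not, and words are split on whitespace. Individual style guides may count differently.

```python
def reading_rate(text, start_seconds, end_seconds):
    """Return (characters_per_second, words_per_minute) for one cue."""
    duration = end_seconds - start_seconds
    if duration <= 0:
        raise ValueError("cue must end after it starts")
    characters = len(text.replace("\n", ""))  # assumption: line breaks are free
    words = len(text.split())
    return characters / duration, words * 60 / duration


cps, wpm = reading_rate("Captions are cues with timings.", 4.0, 6.0)
print(f"{cps:.1f} cps, {wpm:.0f} wpm")
```

A cue can pass a characters-per-second ceiling and still fail a words-per-minute target, or the reverse, so check against whichever figure the client's style guide actually names.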

VTT vs SCC vs TTML vs SRT — which goes where?

Four formats, four lanes. Use the table; this is the question everyone is actually asking.

Caption format by delivery channel
| Format | What it is | Use for | Don't use for |
|--------|------------|---------|---------------|
| VTT | W3C web caption format | HTML5, HLS, YouTube, Vimeo, Wistia, Brightcove | Linear broadcast, Netflix masters |
| SCC | CEA-608 byte pairs | FCC-regulated linear broadcast, ad delivery (XR, CTS) | Web players (most don't render SCC) |
| TTML | XML-based timed text | Netflix, Amazon, DASH-IF, IMSC1.1 masters | Lightweight web embeds |
| SRT | SubRip text, informal | Social uploads, internal review, sub-deliverable | Any broadcast or premium-streaming master |

Two formats that are easy to mistake for broadcast deliverables but are not: VTT and SRT. Both are text-based, both are easy to generate, both are accepted by web platforms. Neither appears in 47 CFR Part 79, in Netflix's Partner Help Center delivery spec, or in Apple's HLS authoring spec as a broadcast or premium-streaming master format.
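For a pipeline that has to pick a format automatically, the four lanes reduce to a lookup. This is a sketch only: the channel keys are hypothetical labels chosen for illustration, and the mapping reflects this article's summary rather than any platform's current spec sheet, which should still be checked.

```python
# Hypothetical channel labels mapped to the caption format for that lane.
FORMAT_BY_CHANNEL = {
    "html5": "VTT",
    "hls": "VTT",
    "youtube": "VTT",
    "vimeo": "VTT",
    "wistia": "VTT",
    "brightcove": "VTT",
    "linear_broadcast": "SCC",
    "ad_delivery": "SCC",
    "netflix": "TTML",
    "amazon": "TTML",
    "social": "SRT",
    "internal_review": "SRT",
}


def caption_format_for(channel):
    """Return the caption format for a delivery channel, or raise ValueError."""
    key = channel.strip().lower().replace(" ", "_")
    try:
        return FORMAT_BY_CHANNEL[key]
    except KeyError:
        raise ValueError(
            f"unknown channel {channel!r}; check the platform's spec sheet"
        ) from None


print(caption_format_for("Linear Broadcast"))
print(caption_format_for("hls"))
```

Raising on an unknown channel, rather than defaulting to VTT, is deliberate: a silent default is how a web format ends up in a broadcast handoff.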

Which platforms accept VTT today?

WebVTT is accepted by every major web-video platform: YouTube (alongside SRT, SCC, TTML, SBV), Vimeo (alongside SRT), Wistia (which converts VTT to SRT on upload and only accepts those two), Brightcove, and the HTML5 <track> element on any modern browser.

WebVTT is not accepted as a master deliverable by Netflix (which mandates TTML1, plus IMSC1.1 for Japanese) and is not part of linear broadcast delivery. Apple's HLS authoring spec does accept it, though not as the only option (IMSC1 in fMP4 is the alternative). For an FCC-regulated air master, the caption pipeline runs CEA-608 / CEA-708 inside the video container, with SCC as the sidecar handoff format.

VTT is for HTML5, HLS, DASH, and web platforms. SCC is for linear broadcast. TTML is for premium SVOD. SRT is for social and review — not for any spec sheet. — The Slate, reading across W3C, FCC, Apple, and Netflix delivery specs

If your client asked for VTT, what should you actually deliver?

First, ask which platform the video will live on. Nine times out of ten, the answer is a web player or an HLS stream, and a real WebVTT file is exactly what they need. In the remaining one in ten, the client meant "a caption file" generically, usually because their CMS uses "VTT" as the label on the upload field, and the correct deliverable is either an SCC for broadcast or a TTML for premium streaming.

Practical safety net: ship all four formats. Post Slate's package outputs VTT, SRT, TTML, and SCC from one source so whichever lane the project lands in, the right file is already there. If you cut your own caption pipeline, build it to author once and export the four; converting between them by hand later is where most of the bugs in caption delivery come from.

Three things to keep on your caption-delivery checklist

  1. Match the file format to the platform's published spec. Apple HLS = WebVTT or IMSC1. FCC broadcast = SCC for the air master. Netflix = TTML / IMSC1.1. YouTube and Vimeo = WebVTT works fine. Do not assume one format covers all four.
  2. Sanity-check the encoding. WebVTT must be UTF-8 and start with the literal WEBVTT signature on line one. SCC files must be drop-frame for 29.97 broadcast (timecode begins at 00;00;00;00 or 01;00;00;00 depending on the broadcaster's spec). Wrong encoding is the most common reason an otherwise-valid caption file fails QC.
  3. Read the captions back against the picture. A 32-character-per-line ceiling is meaningless if a key line is truncated mid-thought. The four FCC pillars — accuracy, synchronicity, completeness, placement — are checked by humans, not by validators. Watch the file end to end before you hand it off.
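The second item on the checklist is the easiest to automate. The sketch below checks that a VTT file decodes as UTF-8 and opens with the WEBVTT signature, and that an SCC file has the Scenarist_SCC V1.0 header and drop-frame timecodes. It assumes the common SCC convention where a semicolon before the frame count marks drop-frame. The function names are illustrative, and neither check replaces a broadcaster's own QC.

```python
import re

SCC_HEADER = "Scenarist_SCC V1.0"
# HH:MM:SS then ':' (non-drop) or ';' (drop-frame) before the frame count.
SCC_TIMECODE = re.compile(r"^\d{2}[:;]\d{2}[:;]\d{2}([:;])\d{2}\s")


def check_vtt_bytes(data):
    """Return a list of problems found in raw WebVTT file bytes."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    first_line = text.lstrip("\ufeff").split("\n", 1)[0].rstrip("\r")
    if first_line != "WEBVTT" and not first_line.startswith(("WEBVTT ", "WEBVTT\t")):
        return ["line one does not start with WEBVTT"]
    return []


def check_scc_text(text):
    """Return a list of problems found in SCC file text."""
    problems = []
    lines = text.splitlines()
    if not lines or lines[0].strip() != SCC_HEADER:
        problems.append("missing Scenarist_SCC V1.0 header")
    for number, line in enumerate(lines, start=1):
        match = SCC_TIMECODE.match(line)
        if match and match.group(1) != ";":
            problems.append(f"line {number}: non-drop-frame timecode")
    return problems


print(check_vtt_bytes("WEBVTT\n\n00:00:01.000 --> 00:00:02.000\nHi\n".encode("utf-8")))
print(check_scc_text("Scenarist_SCC V1.0\n\n00:00:01:00\t9420 9420\n"))
```

The third item, reading the captions back against picture, has no such shortcut. These checks only clear the way for the human pass.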

Editor's note. Prepared by The Slate, Editorial. Published 11 May 2026. WebVTT specification details are from the W3C Candidate Recommendation dated April 4, 2019. Broadcast caption rules are from 47 CFR § 79.1 and § 79.102. Netflix and Apple delivery specs reflect current published guidance and are subject to change.