
<pre class='metadata'>
Title: WebVTT: The Web Video Text Tracks Format
H1: WebVTT: The Web Video Text Tracks Format
Shortname: webvtt1
Status: CG-DRAFT
Default Ref Status: current
Prepare For TR: false
Group: texttracks
ED: https://w3c.github.io/webvtt/
TR: https://www.w3.org/TR/webvtt1/
Level: none
Editor: Gary Katsevman, Mux Inc. https://www.mux.com/, w3c@gkatsev.com
Former Editor: Silvia Pfeiffer, NICTA CSIRO https://www.csiro.au/, silviapfeiffer1@gmail.com
Former Editor: Simon Pieters, Opera Software AS http://www.opera.com/, simonp@opera.com
Former Editor: Philip Jägenstedt, Opera Software ASA http://www.opera.com/, philipj@opera.com
Former Editor: Ian Hickson, Google http://www.google.com/, ian@hixie.ch
!Participate: <a href=https://github.com/w3c/webvtt>GitHub w3c/webvtt</a> (<a href=https://github.com/w3c/webvtt/issues/new>new issue</a>, <a href=https://github.com/w3c/webvtt/issues>open issues</a>, <a href=https://www.w3.org/Bugs/Public/buglist.cgi?product=TextTracks%20CG&component=WebVTT&resolution=--->legacy open bugs</a>)
!Commits: <a href=https://github.com/w3c/webvtt/commits>GitHub w3c/webvtt/commits</a>
!Commits: <a href=https://twitter.com/webvtt>@webvtt</a>
Test Suite: https://github.com/web-platform-tests/wpt/tree/master/webvtt
Abstract: This specification defines WebVTT, the Web Video Text Tracks format. Its main use is for marking up external text track resources in connection with the HTML &lt;track> element.
Abstract: WebVTT files provide captions or subtitles for video content, and also text video descriptions [[MAUR]], chapters for content navigation, and more generally any form of metadata that is time-aligned with audio or video content.
Boilerplate: omit conformance, omit feedback-header
Ignored Terms: unicode-bidi, color, text-combine-upright, text-wrap, lang, class, title
Ignored Vars: seconds-frac, selector, fragment, seen cue


</pre>

<pre class=anchors>
urlPrefix: https://dom.spec.whatwg.org/
    type: dfn
        text: namespaceURI
urlPrefix: https://html.spec.whatwg.org/multipage/
    type: dfn
        urlPrefix: infrastructure.html
            text: ascii digits
            text: split a string on spaces
            text: skip whitespace
            text: alphanumeric ascii characters
            text: space character
            text: case-sensitive
        urlPrefix: embedded-content.html
            text: text track kind
            text: text track cue
            text: text track list of cues
            text: text track
            text: list of text tracks
            text: media element
            text: text track mode
            text: text track showing
            text: rules for updating the text track rendering
            text: text track cue active flag
            text: text track cue display state
            text: current playback position
            text: text track cue identifier
            text: text track cue pause-on-exit flag
            text: rules for extracting the chapter title
            text: text track cue start time
            text: text track cue end time
            text: unbounded text track cue
            text: expose a user interface to the user
            text: text track cue order
            text: honor user preferences for automatic text track selection
        urlPrefix: webappapis.html
            text: entry settings object
        urlPrefix: syntax.html
            text: character references; url: #syntax-charref
            text: additional allowed character
            text: consume a character reference
    type: element-attr
        urlPrefix: dom.html
            text: title; url: #attr-title
            text: lang; url: #attr-lang
            text: class; url: #classes
urlPrefix: https://encoding.spec.whatwg.org/
    type: dfn
        text: utf-8 decode
</pre>

<pre class=link-defaults>
spec:dom; type:interface; text:Document
spec:css-ruby-1; type:value; text:ruby-base
spec:css-color-4; type:property; text:color
spec:css-fonts-3; type:property; text:font-style
spec:css-fonts-3; type:property; text:font-weight
spec:css-ruby-1; type:value; text:ruby
spec:css-ruby-1; type:value; text:ruby-text
spec:css2; type:selector; text::lang()
spec:css-flexbox-1; type:value; text:inline-flex
spec:selectors-3; type:selector; text:::before
spec:selectors-3; type:selector; text:::after
spec:css-display-3; type:property; text:display
spec:html; type:element; text:style
spec:css-align-3; type:value; for:justify-content; text:flex-end
</pre>

<pre class=biblio>
{
    "MAUR": {
        "authors": [ "Shane McCarron", "Michael Cooper", "Mark Sadecki" ],
        "href": "http://www.w3.org/TR/media-accessibility-reqs/",
        "title": "Media Accessibility User Requirements",
        "status": "WD",
        "publisher": "W3C"
    }
}
</pre>

<style>
 samp {
   font-family: inherit;
   background-color: black; /* fallback if rgba() is not supported */
   background-color: rgba(0, 0, 0, 0.7);
   outline: 0.18em solid rgba(0, 0, 0, 0.7);
   color: white;
 }

 [data-algorithm]:not(.heading) {
   padding-left: 2em;
 }
</style>

<h2 id=introduction>Introduction</h2>

<p><i>This section is non-normative.</i></p>

<p>The <dfn>WebVTT</dfn> (Web Video Text Tracks) format is intended for marking up external text
track resources in connection with the HTML &lt;track> element.</p>

<p>WebVTT files provide captions or subtitles for video content, and also text video descriptions
[[MAUR]], chapters for content navigation, and more generally any form of metadata that is
time-aligned with audio or video content.</p>

<p>The majority of the current version of this specification is dedicated to describing how to use
WebVTT files for captioning or subtitling. There is minimal information about chapters and
time-aligned metadata and nothing about video descriptions at this stage.</p>

<p>In this section we provide some example WebVTT files as an introduction.</p>


<h3 id=introduction-caption>A simple caption file</h3>

<p><i>This section is non-normative.</i></p>

<p>The main use for WebVTT files is captioning or subtitling video content. Here is a sample file
that captions an interview:</p>

<pre>
WEBVTT

00:11.000 --> 00:13.000
&lt;v Roger Bingham>We are in New York City

00:13.000 --> 00:16.000
&lt;v Roger Bingham>We're actually at the Lucern Hotel, just down the street

00:16.000 --> 00:18.000
&lt;v Roger Bingham>from the American Museum of Natural History

00:18.000 --> 00:20.000
&lt;v Roger Bingham>And with me is Neil deGrasse Tyson

00:20.000 --> 00:22.000
&lt;v Roger Bingham>Astrophysicist, Director of the Hayden Planetarium

00:22.000 --> 00:24.000
&lt;v Roger Bingham>at the AMNH.

00:24.000 --> 00:26.000
&lt;v Roger Bingham>Thank you for walking down here.

00:27.000 --> 00:30.000
&lt;v Roger Bingham>And I want to do a follow-up on the last conversation we did.

00:30.000 --> 00:31.500 align:right size:50%
&lt;v Roger Bingham>When we e-mailed&mdash;

00:30.500 --> 00:32.500 align:left size:50%
&lt;v Neil deGrasse Tyson>Didn't we talk about enough in that conversation?

00:32.000 --> 00:35.500 align:right size:50%
&lt;v Roger Bingham>No! No no no no; 'cos 'cos obviously 'cos

00:32.500 --> 00:33.500 align:left size:50%
&lt;v Neil deGrasse Tyson>&lt;i>Laughs&lt;/i>

00:35.500 --> 00:38.000
&lt;v Roger Bingham>You know I'm so excited my glasses are falling off here.
</pre>

<p>You can see that a WebVTT file in general consists of a sequence of text segments associated with
a time-interval, called a cue (<a lt="WebVTT cue">definition</a>). Beyond captioning and subtitling,
WebVTT can be used for time-aligned metadata, typically in use for delivering name-value pairs in
cues. WebVTT can also be used for delivering chapters, which helps with contextual navigation around
an audio/video file. Finally, WebVTT can be used for the delivery of text video descriptions, which
is text that describes the visual content of time-intervals and can be synthesized to speech to help
vision-impaired users understand context.</p>

<p class=note>This version of WebVTT focuses on solving the captioning and subtitling use cases.
More specification work is possible for the other use cases. A decision on what type of use case a
WebVTT file is being used for is made by the software that is using the file. For example, if in use
with an HTML file through a &lt;track> element, the <a lt="text track kind">kind</a> attribute
defines how the WebVTT file is to be interpreted.</p>
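
<div class="example">

 <p>As a rough illustration of how the cue timing lines in the sample file above could be read, the
 following TypeScript sketch converts a timestamp to seconds and splits a timing line into its
 parts. The function names <code>parseVttTimestamp</code> and <code>parseTimingLine</code> are
 hypothetical. The sketch is simpler and stricter than the parsing rules a user agent follows: it
 assumes only the two timestamp shapes that appear in this section's examples (with and without an
 hours component) and whitespace around the arrow.</p>

```typescript
// Converts a timestamp such as "00:11.000" or "00:01:50.100" to seconds.
// Returns null when the text has neither of the two assumed shapes.
function parseVttTimestamp(text: string): number | null {
  const match = /^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(text);
  if (match === null) {
    return null;
  }
  const hours = match[1] === undefined ? 0 : parseInt(match[1], 10);
  const minutes = parseInt(match[2], 10);
  const seconds = parseInt(match[3], 10);
  const millis = parseInt(match[4], 10);
  if (minutes > 59 || seconds > 59) {
    return null;
  }
  return hours * 3600 + minutes * 60 + seconds + millis / 1000;
}

// Splits a cue timing line into start time, end time, and raw setting strings.
function parseTimingLine(
  line: string
): { start: number; end: number; settings: string[] } | null {
  const parts = line.trim().split(/\s+/);
  if (parts.length < 3 || parts[1] !== "-->") {
    return null;
  }
  const start = parseVttTimestamp(parts[0]);
  const end = parseVttTimestamp(parts[2]);
  if (start === null || end === null) {
    return null;
  }
  return { start, end, settings: parts.slice(3) };
}
```

 <p>Under these assumptions, the timing line <code>00:30.000 --> 00:31.500 align:right
 size:50%</code> from the sample file yields a start of 30 seconds, an end of 31.5 seconds, and two
 setting strings.</p>

</div>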

<p>The following subsections provide an overview of some of the key features of the WebVTT file
format, particularly when in use for captioning and subtitling.</p>


<h3 id=introduction-multiple-lines>Caption cues with multiple lines</h3>

<p><i>This section is non-normative.</i></p>

<p>Line breaks in cues are honored. User agents will also insert extra line breaks if necessary to
fit the cue in the cue's width. In general, therefore, authors are encouraged to write cues all on
one line except when a line break is definitely necessary.</p>

<div class="example">

 <p>These captions on a public service announcement video demonstrate line breaking:</p>

 <pre>
 WEBVTT

 00:01.000 --> 00:04.000
 Never drink liquid nitrogen.

 00:05.000 --> 00:09.000
 &mdash; It will perforate your stomach.
 &mdash; You could die.

 00:10.000 --> 00:14.000
 The Organisation for Sample Public Service Announcements accepts no liability for the content of this advertisement, or for the consequences of any actions taken on the basis of the information provided.
 </pre>

 <p>The first cue is simple; it will probably just display on one line. The second will take two
 lines, one for each speaker. The third will wrap to fit the width of the video, possibly taking
 multiple lines. For example, the three cues could look like this:</p>

 <!-- 50 -->
 <pre>
 &nbsp;          <samp>Never drink liquid nitrogen.</samp>

         <samp>&mdash; It will perforate your stomach.</samp>
                 <samp>&mdash; You could die.</samp>

     <samp>The Organisation for Sample Public Service</samp>
     <samp>Announcements accepts no liability for the</samp>
     <samp>content of this advertisement, or for the</samp>
      <samp>consequences of any actions taken on the</samp>
         <samp>basis of the information provided.</samp>
 </pre>

 <p>If the width of the cues is smaller, the first two cues could wrap as well, as in the following
 example. Note how the second cue's explicit line break is still honored, however:</p>

 <!-- 25 -->
 <pre>
 &nbsp;     <samp>Never drink</samp>
     <samp>liquid nitrogen.</samp>

   <samp>&mdash; It will perforate</samp>
       <samp>your stomach.</samp>
     <samp>&mdash; You could die.</samp>

   <samp>The Organisation for</samp>
   <samp>Sample Public Service</samp>
   <samp>Announcements accepts</samp>
   <samp>no liability for the</samp>
      <samp>content of this</samp>
   <samp>advertisement, or for</samp>
    <samp>the consequences of</samp>
   <samp>any actions taken on</samp>
     <samp>the basis of the</samp>
   <samp>information provided.</samp>
 </pre>

 <p>Also notice how the wrapping is done so as to keep the line lengths balanced.</p>

</div>


<h3 id=styling>Styling captions</h3>

<p><i>This section is non-normative.</i></p>

<p>CSS style sheets that apply to an HTML page that contains a <a element>video</a> element can
target WebVTT cues and regions in the video using the ''::cue'', ''::cue()'', ''::cue-region'' and
''::cue-region()'' pseudo-elements.</p>

<div class="example">

 <p>In this example, an HTML page has a CSS style sheet in a <a element>style</a> element that
 styles all cues in the video with a gradient background and a text color, as well as changing the
 text color for all <a>WebVTT Bold Objects</a> in cues in the video.</p>

 <pre>
 &lt;!doctype html>
 &lt;html>
  &lt;head>
   &lt;title>Styling WebVTT cues&lt;/title>
   &lt;style>
    video::cue {
      background-image: linear-gradient(to bottom, dimgray, lightgray);
      color: papayawhip;
    }
    video::cue(b) {
      color: peachpuff;
    }
   &lt;/style>
  &lt;/head>
  &lt;body>
   &lt;video controls autoplay src="video.webm">
    &lt;track default src="track.vtt">
   &lt;/video>
  &lt;/body>
 &lt;/html>
 </pre>

</div>

<p>CSS style sheets can also be embedded in WebVTT files themselves.</p>

<p>Style blocks are placed after any headers but before the first cue, and start with the line
"STYLE". Comment blocks can be interleaved with style blocks.</p>

<p>Blank lines cannot appear in the style sheet. They can be removed or be filled with a space or a
CSS comment (e.g. <code>/**/</code>).</p>

<p>The string "<code>--></code>" cannot be used in the style sheet. If the style sheet is wrapped in
"<code>&lt;!--</code>" and "<code>--></code>", then those strings can just be removed. If
"<code>--></code>" appears inside a CSS string, then it can use CSS escaping e.g.
"<code>--\></code>".</p>

<div class="example">

 <p>This example shows how cues can be styled with style blocks in WebVTT.</p>

 <pre>
 WEBVTT

 STYLE
 ::cue {
   background-image: linear-gradient(to bottom, dimgray, lightgray);
   color: papayawhip;
 }
 /* Style blocks cannot use blank lines nor "dash dash greater than" */

 NOTE comment blocks can be used between style blocks.

 STYLE
 ::cue(b) {
   color: peachpuff;
 }

 hello
 00:00:00.000 --> 00:00:10.000
 Hello &lt;b>world&lt;/b>.

 NOTE style blocks cannot appear after the first cue.
 </pre>

</div>
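
<div class="example">

 <p>The two restrictions on embedded style sheets described above (no blank lines, and no
 "<code>--></code>" string) lend themselves to a simple authoring-time check. The following
 TypeScript sketch is one possible way to do that; the function name
 <code>findStyleBlockProblems</code> is hypothetical, and the input is assumed to be the text of a
 single style block after its "STYLE" line, without a trailing line terminator.</p>

```typescript
// Reports the lines of a style block that would break it when embedded in a
// WebVTT file: completely empty lines and lines containing "-->".
function findStyleBlockProblems(styleSheet: string): string[] {
  const problems: string[] = [];
  const lines = styleSheet.split(/\r\n|\r|\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    // A line holding only a space or a CSS comment is fine; an empty one is not.
    if (line === "") {
      problems.push("line " + (i + 1) + ": blank line");
    }
    if (line.indexOf("-->") !== -1) {
      problems.push("line " + (i + 1) + ": contains -->");
    }
  }
  return problems;
}
```

 <p>A style sheet whose empty lines have been replaced by a space or by <code>/**/</code>, and whose
 "<code>--></code>" occurrences have been removed or escaped, produces an empty list.</p>

</div>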

<h3 id=introduction-other-features>Other caption and subtitling features</h3>

<p><i>This section is non-normative.</i></p>

<p>WebVTT also supports some less-often used features.</p>

<div class="example">

 <p>In this example, the cues have an identifier:</p>

 <pre>
 WEBVTT

 test
 00:00.000 --> 00:02.000
 This is a test.

 123
 00:00.000 --> 00:02.000
 That's an, an, that's an L!

 crédit de transcription
 00:04.000 --> 00:05.000
 Transcrit par Célestes&trade;
 </pre>

 <p>This allows a style sheet to specifically target the cues.</p>

 <pre>
 /* style for cue: test */
 ::cue(#test) { color: lime; }
 </pre>

 <p>Due to the syntax rules of CSS, some characters need to be escaped with CSS character escape
 sequences. For example, an ID that starts with a number 0-9 needs to be escaped. The ID
 <code>123</code> can be represented as "\31 23" (31 refers to the Unicode code point for "1"). See
 <a href="https://www.w3.org/International/questions/qa-escapes">Using character escapes in markup
 and CSS</a> for more information on CSS escapes.</p>

 <pre>
 /* style for cue: 123 */
 ::cue(#\31 23) { color: lime; }
 /* style for cue: crédit de transcription */
 ::cue(#crédit\ de\ transcription) { color: red; }
 </pre>

</div>
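
<div class="example">

 <p>The escaping shown above can also be generated by a script. The following TypeScript sketch
 covers only the two cases discussed in this example, a leading digit and space characters; the
 function name <code>escapeCueIdForSelector</code> is hypothetical. It is not a complete CSS
 identifier serializer. In browsers, the <code>CSS.escape()</code> method handles the general
 case.</p>

```typescript
// Simplified escaping of a cue identifier for use inside ::cue(#...).
// Only a leading digit and space characters are handled.
function escapeCueIdForSelector(id: string): string {
  let result = "";
  for (let i = 0; i < id.length; i++) {
    const ch = id.charAt(i);
    if (i === 0 && ch >= "0" && ch <= "9") {
      // A leading digit becomes a hexadecimal code point escape; the space
      // that follows terminates the escape sequence.
      result += "\\" + ch.charCodeAt(0).toString(16) + " ";
    } else if (ch === " ") {
      // A space is escaped with a backslash.
      result += "\\ ";
    } else {
      result += ch;
    }
  }
  return result;
}
```

 <p>With this sketch, the identifier <code>123</code> becomes <code>\31 23</code> and
 <code>crédit de transcription</code> becomes <code>crédit\ de\ transcription</code>, matching the
 selectors above.</p>

</div>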

<div class="example">

 <p>This example shows how classes can be used on elements, which can be helpful for localization or
 maintainability of styling, and also how to indicate a language change in the cue text.</p>

 <pre>
 WEBVTT

 04:02.500 --> 04:05.000
 J'ai commencé le basket à l'âge de 13, 14 ans

 04:05.001 --> 04:07.800
 Sur les &lt;i.foreignphrase>&lt;lang en>playground&lt;/lang>&lt;/i>, ici à Montpellier
 </pre>

</div>

<div class="example">

 <p>In this example, each cue says who is talking using voice spans. In the first cue, the span
 specifying the speaker is also annotated with two classes, "first" and "loud". In the third cue,
 there is also some italics text (not associated with a specific speaker). The last cue is annotated
 with just the class "loud".</p>

 <pre>
 WEBVTT

 00:00.000 --> 00:02.000
 &lt;v.first.loud Esme>It's a blue apple tree!

 00:02.000 --> 00:04.000
 &lt;v Mary>No way!

 00:04.000 --> 00:06.000
 &lt;v Esme>Hee!&lt;/v> &lt;i>laughter&lt;/i>

 00:06.000 --> 00:08.000
 &lt;v.loud Mary>That's awesome!
 </pre>

 <p>Notice that as a special exception, the voice spans don't have to be closed if they cover the
 entire cue text.</p>

 <p>Style sheets can style these spans:</p>

 <pre>
 ::cue(v[voice="Esme"]) { color: cyan }
 ::cue(v[voice="Mary"]) { color: lime }
 ::cue(i) { font-style: italic }
 ::cue(.loud) { font-size: 2em }
 </pre>

</div>
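
<div class="example">

 <p>To make the structure of the start tags in the preceding examples more concrete, the following
 TypeScript sketch splits a tag such as <code>&lt;v.first.loud Esme></code> into its tag name, its
 classes, and its annotation. The function name <code>parseCueStartTag</code> and the regular
 expression are assumptions made for illustration; they are much less tolerant than the cue text
 parsing rules that user agents apply.</p>

```typescript
interface CueStartTag {
  name: string;       // for example "v", "i" or "lang"
  classes: string[];  // for example ["first", "loud"]
  annotation: string; // for example "Esme"; empty when absent
}

// Parses a single start tag. Returns null for end tags and for anything else
// that does not have the assumed shape.
function parseCueStartTag(tag: string): CueStartTag | null {
  const match = /^<([a-z]+)((?:\.[^.\s>]+)*)(?:[ \t]+([^>]*))?>$/.exec(tag);
  if (match === null) {
    return null;
  }
  // match[2] looks like ".first.loud"; drop the empty piece before the first dot.
  const classes = match[2].split(".").filter((part) => part !== "");
  const annotation = match[3] === undefined ? "" : match[3].trim();
  return { name: match[1], classes: classes, annotation: annotation };
}
```

 <p>For <code>&lt;v.first.loud Esme></code> the sketch reports the name "v", the classes "first" and
 "loud", and the annotation "Esme"; for <code>&lt;lang en></code> it reports the name "lang" with
 the annotation "en".</p>

</div>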

<div class="example">
 <p>This example shows how to position cues at explicit positions in the video viewport.</p>

 <pre>
 WEBVTT

 00:00:00.000 --> 00:00:04.000 position:10%,line-left align:left size:35%
 Where did he go?

 00:00:03.000 --> 00:00:06.500 position:90% align:right size:35%
 I think he went down this lane.

 00:00:04.000 --> 00:00:06.500 position:45%,line-right align:center size:35%
 What are you waiting for?
 </pre>

 <p>Since the cues in these examples are horizontal, the "position" setting refers to a percentage
 of the width of the video viewport. If the text were vertical, the "position" setting would refer
 to the height of the video viewport.</p>

 <p>The "line-left" or "line-right" only refers to the physical side of the box to which the
 "position" setting applies, in a way which is agnostic regarding the horizontal or vertical
 direction of the cue. It does not affect or relate to the direction or position of the text itself
 within the box.</p>

 <p>The cues cover only 35% of the video viewport's width - that's the <a lt="WebVTT cue box">cue
 box</a>'s "size" for all three cues.</p>

 <p>The first cue has its <a lt="WebVTT cue box">cue box</a> positioned at the 10% mark. The
 "line-left" or "line-right" within the "position" setting indicates which side of the <a
 lt="WebVTT cue box">cue box</a> the position refers to. Since in this case the text is horizontal,
 "line-left" refers to the left side of the box, and the cue box is thus positioned between the 10%
 and the 45% mark of the video viewport's width, probably underneath a speaker on the left of the
 video image. If the cue were vertical, "line-left" positioning would be from the top of the video
 viewport's height and the <a lt="WebVTT cue box">cue box</a> would cover 35% of the video
 viewport's height.</p>

 <p>The text within the first cue's cue box is aligned using the "align" cue setting. For
 left-to-right rendered text, "start" alignment is the left of that box, for right-to-left rendered
 text the right of the box. So, independent of the directionality of the text, it will stay
 underneath that speaker. Note that "center" position alignment of the cue box is the default for
 start aligned text, in order to avoid having the box move when the base direction of the text
 changes (from left-to-right to right-to-left or vice versa) as a result of translation.</p>

 <p>The second cue has its <a lt="WebVTT cue box">cue box</a> right aligned at the 90% mark of the
 video viewport width ("right" aligned text right aligns the box). The same effect can be achieved
 with "position:55%,line-left", which explicitly positions the cue box. The third cue has center
 aligned text within the same positioned cue box as the first cue.</p>

</div>
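
<div class="example">

 <p>The arithmetic behind the preceding positioning example can be sketched in a few lines of
 TypeScript. The function name <code>cueBoxExtent</code> and its parameters are hypothetical. All
 values are percentages of the video viewport's width, the cue is assumed to be horizontal, and the
 position alignment is passed in explicitly rather than derived from the "align" setting. Clamping
 to the viewport is deliberately left out.</p>

```typescript
type PositionAlignment = "line-left" | "center" | "line-right";

// Returns the left and right edges of a horizontal cue box, in percent of the
// video viewport's width.
function cueBoxExtent(
  position: number,
  alignment: PositionAlignment,
  size: number
): { left: number; right: number } {
  let left: number;
  if (alignment === "line-left") {
    // The position names the left side of the box.
    left = position;
  } else if (alignment === "line-right") {
    // The position names the right side of the box.
    left = position - size;
  } else {
    // The position names the middle of the box.
    left = position - size / 2;
  }
  return { left: left, right: left + size };
}
```

 <p>With a size of 35, the first cue (<code>position:10%,line-left</code>) and the third cue
 (<code>position:45%,line-right</code>) both span 10% to 45%, while the second cue, right aligned
 at 90%, spans 55% to 90%, the same box that <code>position:55%,line-left</code> describes.</p>

</div>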

<div class="example">
 <p>This example shows two regions containing rollup captions for two different speakers. Fred's
 cues scroll up in a region in the left half of the video, Bill's cues scroll up in a region on the
 right half of the video. Fred's first cue disappears at 12.5sec even though it is defined until
 20sec because its region is limited to 3 lines and at 12.5sec a fourth cue appears:</p>

 <pre>
 WEBVTT

 REGION
 id:fred
 width:40%
 lines:3
 regionanchor:0%,100%
 viewportanchor:10%,90%
 scroll:up

 REGION
 id:bill
 width:40%
 lines:3
 regionanchor:100%,100%
 viewportanchor:90%,90%
 scroll:up

 00:00:00.000 --> 00:00:20.000 region:fred align:left
 &lt;v Fred>Hi, my name is Fred

 00:00:02.500 --> 00:00:22.500 region:bill align:right
 &lt;v Bill>Hi, I'm Bill

 00:00:05.000 --> 00:00:25.000 region:fred align:left
 &lt;v Fred>Would you like to get a coffee?

 00:00:07.500 --> 00:00:27.500 region:bill align:right
 &lt;v Bill>Sure! I've only had one today.

 00:00:10.000 --> 00:00:30.000 region:fred align:left
 &lt;v Fred>This is my fourth!

 00:00:12.500 --> 00:00:32.500 region:fred align:left
 &lt;v Fred>OK, let's go.
 </pre>

 <p>Note that regions are only defined for horizontal cues.</p>

</div>
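
<div class="example">

 <p>The reason Fred's first cue disappears at 12.5 seconds can be reproduced with a very rough
 model. The following TypeScript sketch assumes that every cue occupies exactly one line, that a
 cue is active from its start time up to but not including its end time, and that a region simply
 shows the most recently started active cues up to its line limit. The names
 <code>RegionCue</code> and <code>visibleRegionLines</code> are hypothetical, and the model ignores
 the actual layout and scrolling behavior of user agents.</p>

```typescript
interface RegionCue {
  start: number; // seconds
  end: number;   // seconds
  text: string;
}

// Returns the texts shown in a region at the given time, oldest first, keeping
// only the last maxLines active cues.
function visibleRegionLines(cues: RegionCue[], time: number, maxLines: number): string[] {
  const active = cues
    .filter((cue) => cue.start <= time && time < cue.end)
    .sort((a, b) => a.start - b.start);
  const firstShown = Math.max(0, active.length - maxLines);
  return active.slice(firstShown).map((cue) => cue.text);
}

// Fred's cues from the example above.
const fredCues: RegionCue[] = [
  { start: 0, end: 20, text: "Hi, my name is Fred" },
  { start: 5, end: 25, text: "Would you like to get a coffee?" },
  { start: 10, end: 30, text: "This is my fourth!" },
  { start: 12.5, end: 32.5, text: "OK, let's go." },
];
```

 <p>In this model, <code>visibleRegionLines(fredCues, 12, 3)</code> still includes "Hi, my name is
 Fred", whereas at 12.5 seconds four cues are active and only the last three remain visible.</p>

</div>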

<h3 id=introduction-comments>Comments in WebVTT</h3>

<p><i>This section is non-normative.</i></p>

<p>Comments can be included in WebVTT files.</p>

<p>Comments are just blocks that are preceded by a blank line, start with the word
"<code>NOTE</code>" (followed by a space or newline), and end at the first blank line.</p>

<div class="example">

 <p>Here, a one-line comment is used to note a possible problem with a cue.</p>

 <pre>
 WEBVTT

 00:01.000 --> 00:04.000
 Never drink liquid nitrogen.

 NOTE I'm not sure the timing is right on the following cue.

 00:05.000 --> 00:09.000
 &mdash; It will perforate your stomach.
 &mdash; You could die.
 </pre>

</div>

<div class="example">

 <p>In this example, the author has written many comments.</p>

 <pre>
 WEBVTT

 NOTE
 This file was written by Jill. I hope
 you enjoy reading it. Some things to
 bear in mind:
 - I was lip-reading, so the cues may
 not be 100% accurate
 - I didn't pay too close attention to
 when the cues should start or end.

 00:01.000 --> 00:04.000
 Never drink liquid nitrogen.

 NOTE check next cue

 00:05.000 --> 00:09.000
 &mdash; It will perforate your stomach.
 &mdash; You could die.

 NOTE end of file
 </pre>

</div>
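
<div class="example">

 <p>A tool that processes WebVTT files might want to skip comments. The following TypeScript sketch
 follows the informal description above: it splits a file into blocks at blank lines and drops
 every block whose first line is "<code>NOTE</code>" alone or starts with "<code>NOTE</code>"
 followed by a space. The function names are hypothetical, and the sketch assumes that lines are
 terminated by U+000A LINE FEED only.</p>

```typescript
// True when a block's first line is "NOTE" alone or "NOTE" followed by a space.
function isCommentBlock(block: string): boolean {
  const firstLine = block.split("\n")[0];
  return firstLine === "NOTE" || firstLine.indexOf("NOTE ") === 0;
}

// Splits a file into blocks at blank lines and removes the comment blocks.
function stripCommentBlocks(file: string): string[] {
  return file
    .split(/\n{2,}/)
    .filter((block) => block.trim() !== "" && !isCommentBlock(block));
}
```

 <p>Applied to the first example in this section, the sketch returns three blocks: the
 "<code>WEBVTT</code>" header line and the two cues.</p>

</div>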

<h3 id=introduction-chapters>Chapters example</h3>

<p><i>This section is non-normative.</i></p>

<p>A WebVTT file can consist of chapters, which are navigation markers for the video.</p>

<p>Chapters are plain text, typically just a single line.</p>

<div class="example">

 <p>In this example, a talk is split into each slide being a chapter.</p>

 <pre>
 WEBVTT

 NOTE
 This is from a talk Silvia gave about WebVTT.

 Slide 1
 00:00:00.000 --> 00:00:10.700
 Title Slide

 Slide 2
 00:00:10.700 --> 00:00:47.600
 Introduction by Naomi Black

 Slide 3
 00:00:47.600 --> 00:01:50.100
 Impact of Captions on the Web

 Slide 4
 00:01:50.100 --> 00:03:33.000
 Requirements of a Video text format
 </pre>

</div>

<h3 id=introduction-metadata>Metadata example</h3>

<p><i>This section is non-normative.</i></p>

<p>A WebVTT file can consist of time-aligned metadata.</p>

<p>Metadata can be any string and is often provided as a JSON construct.</p>

<p>Note that you cannot provide blank lines inside a metadata block, because the blank line
signifies the end of the WebVTT cue.</p>

<div class="example">

 <p>In this example, each cue carries a JSON object as time-aligned metadata.</p>

 <pre>
 WEBVTT

 NOTE
 Thanks to http://output.jsbin.com/mugibo

 1
 00:00:00.100 --> 00:00:07.342
 {
  "type": "WikipediaPage",
  "url": "https://en.wikipedia.org/wiki/Samurai_Pizza_Cats"
 }

 2
 00:07.810 --> 00:09.221
 {
  "type": "WikipediaPage",
  "url" :"http://samuraipizzacats.wikia.com/wiki/Samurai_Pizza_Cats_Wiki"
 }

 3
 00:11.441 --> 00:14.441
 {
  "type": "LongLat",
  "lat" : "36.198269",
  "long": "137.2315355"
 }
 </pre>

</div>
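
<div class="example">

 <p>A script consuming such a file could decode each cue's payload with a JSON parser. The
 following TypeScript sketch takes the text of one cue block (the lines between two blank lines)
 and returns its identifier and parsed payload. The function name <code>parseMetadataCue</code> is
 hypothetical, and the sketch assumes that the payload is a JSON object and that lines are
 separated by U+000A LINE FEED.</p>

```typescript
// Extracts the identifier and the JSON payload from a single cue block.
function parseMetadataCue(block: string): { id: string; payload: Record<string, unknown> } {
  const lines = block.split("\n");
  // The timing line is the first line containing the arrow.
  let timingIndex = -1;
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].indexOf("-->") !== -1) {
      timingIndex = i;
      break;
    }
  }
  if (timingIndex === -1) {
    throw new Error("no timing line found");
  }
  // A line before the timing line, if any, is the cue identifier.
  const id = timingIndex > 0 ? lines[0] : "";
  // Everything after the timing line is the payload.
  const payload = JSON.parse(lines.slice(timingIndex + 1).join("\n"));
  return { id: id, payload: payload };
}
```

 <p>Because a blank line ends the cue, a JSON payload that was pretty-printed with empty lines
 would be cut off at the first of them and would fail to parse.</p>

</div>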


<h2 id=conformance>Conformance</h2>

<p>All diagrams, examples, and notes in this specification are non-normative, as are all sections
explicitly marked non-normative. Everything else in this specification is normative.</p>

<p>The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY", and "OPTIONAL" in the normative
parts of this document are to be interpreted as described in RFC2119. The key word "OPTIONALLY" in
the normative parts of this document is to be interpreted with the same normative meaning as "MAY"
and "OPTIONAL". For readability, these words do not appear in all uppercase letters in this
specification. [[!RFC2119]]</p>

<p>Requirements phrased in the imperative as part of algorithms (such as "strip any leading space
characters" or "return false and abort these steps") are to be interpreted with the meaning of the
key word ("must", "should", "may", etc) used in introducing the algorithm.</p>

<p>Conformance requirements phrased as algorithms or specific steps may be implemented in any
manner, so long as the end result is equivalent. (In particular, the algorithms defined in this
specification are intended to be easy to follow, and not intended to be performant.)</p>


<h3 id=conformance-classes>Conformance classes</h3>

<p>This specification describes the conformance criteria for user agents (relevant to implementors)
and <a>WebVTT files</a> (relevant to authors and authoring tool implementors).</p>

<p class=note>[[#syntax]] defines what constitutes a valid <a>WebVTT file</a>. Authors need to
follow the requirements therein, and are encouraged to use a conformance checker. [[#parsing]]
defines how user agents are to interpret a file labelled as <a>text/vtt</a>, for both valid and
invalid <a>WebVTT files</a>. The parsing rules are more tolerant to author errors than the syntax
allows, in order to provide for extensibility and to still render cues that have some syntax
errors.</p>

<p class=example>For example, the parser will create two cues even if the blank line between them is
skipped. This is clearly a mistake, so a conformance checker will flag it as an error, but it is
still useful to render the cues to the user.</p>

<p>User agents fall into several (possibly overlapping) categories with different conformance
requirements.</p>

<dl>
 <dt>User agents that support scripting</dt>
 <dd><p>All processing requirements in this specification apply. The user agent must also be a
 conforming implementation of the IDL fragments in this specification, as described in the Web IDL
 specification. [[!WEBIDL]]</p></dd>

 <dt>User agents with no scripting support</dt>
 <dd><p>All processing requirements in this specification apply, except those in
 [[#dom-construction-rules]] and [[#api]].</p></dd>

 <dt><dfn>User agents that do not support CSS</dfn></dt>
 <dd><p>All processing requirements in this specification apply, except parts of [[#parsing]] that
 relate to stylesheets and CSS, and all of [[#rendering]] and [[#css-extensions]]. The user agent
 must instead only render the text inside <a>WebVTT caption or subtitle cue text</a> in an
 appropriate manner and specifically support the color classes defined in [[#default-classes]]. Any
 other styling instructions are optional.</p> </dd>

 <dt><dfn>User agents that do not support a full HTML CSS engine</dfn></dt>
 <dd><p>All processing requirements in this specification apply, including the color classes defined
 in [[#default-classes]]. However, the user agent will need to apply the CSS related features in
 [[#parsing]], [[#rendering]] and [[#css-extensions]] in such a way that the rendered results are
 equivalent to what a full CSS supporting renderer produces.</p></dd>

 <dt><dfn>User agents that support a full HTML CSS engine</dfn></dt>
 <dd><p>All processing requirements in this specification apply. However, only a limited set of CSS
 styles is allowed because <a>user agents that do not support a full HTML CSS engine</a> will need
 to implement CSS functionality equivalents. User agents that support a full CSS engine must
 therefore limit the CSS styles they apply for WebVTT so as to enable identical rendering without
 bleeding in extra CSS styles that are beyond the WebVTT specification.</p></dd>

 <dt>Conformance checkers</dt>
 <dd><p>Conformance checkers must verify that a <a>WebVTT file</a> conforms to the applicable
 conformance criteria described in this specification. The term "validator" is equivalent to
 conformance checker for the purpose of this specification.</p></dd>

 <dt>Authoring tools</dt>
 <dd>
  <p>Authoring tools must generate conforming <a>WebVTT files</a>. Tools that convert other formats
  to <a>WebVTT</a> are also considered to be authoring tools.</p>

  <p>When an authoring tool is used to edit a non-conforming <a>WebVTT file</a>, it may preserve the
  conformance errors in sections of the file that were not edited during the editing session (i.e.
  an editing tool is allowed to round-trip erroneous content). However, an authoring tool must not
  claim that the output is conformant if errors have been so preserved.</p>
 </dd>
</dl>


<h3 id=unicode-normalization>Unicode normalization</h3>

<p>Implementations of this specification must not normalize Unicode text during processing.</p>

<p class=example>For example, a cue with an identifier consisting of the characters U+0041 LATIN
CAPITAL LETTER A followed by U+030A COMBINING RING ABOVE (a decomposed character sequence), or the
character U+212B ANGSTROM SIGN (a compatibility character), will not match a selector targeting a
cue with an ID consisting of the character U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE (a
precomposed character).</p>
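
<div class="example">

 <p>The effect of this requirement can be observed with plain string comparison. The following
 TypeScript sketch compares the three identifiers from the example above without normalization,
 which is what this specification requires, and then shows that they would compare equal only if
 NFC normalization were applied. The function name <code>cueIdMatches</code> is hypothetical and
 stands in for the identifier comparison a selector match performs.</p>

```typescript
const decomposedId = "\u0041\u030A"; // A followed by COMBINING RING ABOVE
const angstromId = "\u212B";         // ANGSTROM SIGN
const precomposedId = "\u00C5";      // LATIN CAPITAL LETTER A WITH RING ABOVE

// Compares identifiers exactly as stored, without Unicode normalization.
function cueIdMatches(cueId: string, selectorId: string): boolean {
  return cueId === selectorId;
}

// Neither alternative spelling matches the precomposed identifier.
const decomposedMatches = cueIdMatches(decomposedId, precomposedId);
const angstromMatches = cueIdMatches(angstromId, precomposedId);

// Normalizing first would hide the difference, which is why it is not done.
const matchesAfterNfc =
  decomposedId.normalize("NFC") === precomposedId &&
  angstromId.normalize("NFC") === precomposedId;
```

</div>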


<h2 id=data-model>Data model</h2>
<!-- Describe metadata, caption/subtitle, chapter & description cues -->

<p class=note>The box model of WebVTT consists of three key elements: the video viewport, cues, and
regions. The video viewport is the rendering area into which cues and regions are rendered. Cues are
boxes consisting of a set of cue lines. Regions are subareas of the video viewport that are used to
group cues together. Cues are positioned either inside the video viewport directly or inside a
region, which is positioned inside the video viewport.</p>

<p class=note>The position of a cue inside the video viewport is defined by a set of cue settings.
The position of a region inside the video viewport is defined by a set of region settings. Cues that
are inside regions can only use a limited set of their cue settings. Specifically, if the cue has a
"vertical", "line" or "size" setting, the cue drops out of the region. Otherwise, the cue's width is
calculated to be relative to the region width rather than the viewport. </p>

<h3 id=model-overview>Overview</h3>

<p><i>This section is non-normative.</i></p>

<p>The WebVTT file is a container file for chunks of data that are time-aligned with a video or
audio resource. It can therefore be regarded as a serialisation format for time-aligned data.</p>

<p>A WebVTT file starts with a header and then contains a series of data blocks. If a data block has
a start and end time, it is called a WebVTT cue. A comment is another kind of data block.</p>

<p>Different kinds of data can be carried in WebVTT files. The HTML specification identifies
captions, subtitles, chapters, audio descriptions and metadata as data kinds and specifies which one
is being used in the <a>text track kind</a> attribute of the <a>text track</a> element
[[!HTML]].</p>

<p>A WebVTT file must only contain data of one kind, never a mix of different kinds of data. The
data kind of a WebVTT file is externally specified, such as in an HTML file's <a>text track</a>
element. The environment is responsible for interpreting the data correctly.</p>

<p>WebVTT caption or subtitle cues are rendered as overlays on top of a video viewport or into a
region, which is a subarea of the video viewport.</p>

<h3 id=model-cues>WebVTT cues</h3>

<p>A <dfn>WebVTT cue</dfn> is a <a>text track cue</a> [[!HTML]] that additionally consists of the
following: </p>

<dl>

 <dt><dfn lt="cue text">A cue text</dfn></dt>
 <dd>
  <p>The raw text of the cue, and rules for its interpretation.</p>

 </dd>

</dl>


<h3 id=cues>WebVTT caption or subtitle cues</h3>

<p>A <dfn>WebVTT caption or subtitle cue</dfn> is a <a>WebVTT cue</a> that has the following
additional properties allowing the <a>cue text</a> to be rendered and converted to a DOM
fragment:</p>

<dl>

 <dt><dfn lt="WebVTT cue box">A cue box</dfn></dt>
 <dd>
  <p>The cue box of a <a>WebVTT cue</a> is a box within which the text of all lines of the cue is to
  be rendered. It is either rendered into the video's viewport or a region inside the viewport if
  the cue is part of a region.</p>

  <p class="note">The position of the <a lt="WebVTT cue box">cue box</a> within the video viewport's
  or region's dimensions depends on the value of the <a>WebVTT cue position</a> and the <a>WebVTT
  cue line</a>.</p>

  <p class="note">Lines are wrapped within the <a lt="WebVTT cue box">cue box</a>'s <a lt="WebVTT
  cue size">size</a> if lines' lengths make this necessary.</p>

 </dd>

 <dt><dfn lt="WebVTT cue writing direction">A writing direction</dfn></dt>
 <dd>
  <p>A writing direction, either</p>
  <ul>

   <li><dfn lt="WebVTT cue horizontal writing direction">horizontal</dfn> (a line extends
   horizontally and is offset vertically from the video viewport's top edge, with consecutive lines
   displayed below each other),</li>

   <li><dfn lt="WebVTT cue vertical growing left writing direction">vertical growing left</dfn> (a
   line extends vertically and is offset horizontally from the video viewport's right edge, with
   consecutive lines displayed to the left of each other<!-- used for east asian-->), or</li>

   <li><dfn lt="WebVTT cue vertical growing right writing direction">vertical growing right</dfn> (a
   line extends vertically and is offset horizontally from the video viewport's left edge, with
   consecutive lines displayed to the right of each other<!-- used for mongolian -->).</li>

  </ul>

  <p class=note>The <a lt="WebVTT cue writing direction">writing direction</a> determines whether
  the <a lt="WebVTT cue line">line</a>, <a lt="WebVTT cue position">position</a>, and <a lt="WebVTT
  cue size">size</a> cue settings are interpreted with respect to the width or the height of the
  video.</p>

  <p>By default, the <a lt="WebVTT cue writing direction">writing direction</a> is set to <a
  lt="WebVTT cue horizontal writing direction">horizontal</a>.</p>

  <p class=note>The <a lt="WebVTT cue vertical growing left writing direction">vertical growing
  left</a> writing direction could be used for vertical Chinese, Japanese, and Korean, and the <a
  lt="WebVTT cue vertical growing right writing direction">vertical growing right</a> writing
  direction could be used for vertical Mongolian.</p>

 </dd>

 <dt><dfn lt="WebVTT cue snap-to-lines flag">A snap-to-lines flag</dfn></dt>
 <dd>

  <p>A boolean indicating whether the <a lt="WebVTT cue line">line</a> is an integer number of lines
  (using the line dimensions of the first line of the cue), or whether it is a percentage of the
  dimension of the video. The flag is set to true when lines are counted, and false otherwise.</p>

  <p>Cues where the flag is false will be offset as requested modulo overlap avoidance if multiple
  cues are in the same place.</p>

  <p>By default, the <a lt="WebVTT cue snap-to-lines flag">snap-to-lines flag</a> is set to
  true.</p>

 </dd>

 <dt><dfn lt="WebVTT cue line">A line</dfn></dt>
 <dd>
  <p>The <a lt="WebVTT cue line">line</a> defines positioning of the <a lt="WebVTT cue box">cue
  box</a>.</p>

  <p>The <a lt="WebVTT cue line">line</a> offsets the <a lt="WebVTT cue box">cue box</a> from the
  top, the right or left of the video viewport as defined by the <a lt="WebVTT cue writing
  direction">writing direction</a>, the <a lt="WebVTT cue snap-to-lines flag">snap-to-lines
  flag</a>, or the lines occupied by any other showing tracks.</p>

  <p>The <a lt="WebVTT cue line">line</a> is set either as a number of lines, a percentage of the
  video viewport height or width, or as the special value <dfn lt="WebVTT cue line
  automatic">auto</dfn>, which means the offset is to depend on the other showing tracks.</p>

  <p>By default, the <a lt="WebVTT cue line">line</a> is set to <a lt="WebVTT cue line
  automatic">auto</a>.</p>

  <p>If the <a lt="WebVTT cue writing direction">writing direction</a> is <a lt="WebVTT cue
  horizontal writing direction">horizontal</a>, then the <a lt="WebVTT cue line">line</a>
  percentages are relative to the height of the video, otherwise to the width of the video.</p>

  <p>A <a>WebVTT cue</a> has a <dfn lt="cue computed line">computed line</dfn> whose value is that
  returned by the following algorithm, which is defined in terms of the other aspects of the
  cue:</p>

  <ol algorithm="computed line">

   <li>

    <p>If the <a lt="WebVTT cue line">line</a> is numeric, the <a>WebVTT cue snap-to-lines flag</a>
    of the <a>WebVTT cue</a> is false, and the <a lt="WebVTT cue line">line</a> is negative or
    greater than 100, then return 100 and abort these steps.</p>

    <p class="note">Although the <a>WebVTT parser</a> will not set the <a lt="WebVTT cue
    line">line</a> to a number outside the range 0..100 and also set the <a>WebVTT cue snap-to-lines
    flag</a> to false, this can happen when using the DOM API's {{VTTCue/snapToLines}} and
    {{VTTCue/line}} attributes.</p>

   </li>

   <li><p>If the <a lt="WebVTT cue line">line</a> is numeric, return the value of the <a>WebVTT cue
   line</a> and abort these steps. (Either the <a>WebVTT cue snap-to-lines flag</a> is true, so any
   value, not just those in the range 0..100, is valid, or the value is in the range 0..100 and is
   thus valid regardless of the value of that flag.)</p></li>

   <li><p>If the <a>WebVTT cue snap-to-lines flag</a> of the <a>WebVTT cue</a> is false, return the
   value 100 and abort these steps. (The <a lt="WebVTT cue line">line</a> is the special value <a
   lt="WebVTT cue line automatic">auto</a>.)</p></li>

   <li><p>Let |cue| be the <a>WebVTT cue</a>.</p></li>

   <li><p>If |cue| is not in a <a lt="text track list of cues">list of cues</a> of a <a>text
   track</a>, or if that <a>text track</a> is not in the <a>list of text tracks</a> of a <a>media
   element</a>, return &#x2212;1 and abort these steps.</p></li>

   <li><p>Let |track| be the <a>text track</a> whose <a lt="text track list of cues">list of
   cues</a> the |cue| is in.</p></li>

   <li><p>Let |n| be the number of <a>text tracks</a> whose <a>text track mode</a> is <a lt="text
   track showing">showing</a> and that are in the <a>media element</a>'s <a>list of text tracks</a>
   before |track|.</p></li>

   <li><p>Increment |n| by one.</p></li>

   <li><p>Negate |n|.</p></li>

   <li><p>Return |n|.</p></li>

  </ol>

  <p class="example">For example, if two <a>text tracks</a> are <a lt="text track
  showing">showing</a> at the same time in one <a>media element</a>, and each <a>text track</a>
  currently has an active <a>WebVTT cue</a> whose <a lt="WebVTT cue line">line</a> is <a
  lt="WebVTT cue line automatic">auto</a>, then the first <a>text track</a>'s cue's <a lt="cue
  computed line">computed line</a> will be &#x2212;1 and the second will be &#x2212;2.</p>
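
  <p class="note">The steps above can be sketched in Python as follows. This is an illustrative
  sketch only, not a normative implementation. The function name, the string stand-in for the
  special value auto, and the <code>showing_tracks_before</code> parameter (which replaces the
  lookup of the cue's text track in the media element's list of text tracks) are assumptions made
  for the example.</p>

```python
from typing import Optional, Union

AUTO = "auto"  # stand-in for the special value "auto" (assumption of this sketch)


def computed_line(line: Union[float, str],
                  snap_to_lines: bool,
                  showing_tracks_before: Optional[int] = None) -> float:
    """Return the computed line of a cue.

    showing_tracks_before is a hypothetical input: the number of showing
    text tracks that precede the cue's track in the media element's list
    of text tracks, or None if the cue is not in a track of a media element.
    """
    numeric = line != AUTO
    # Step 1: a percentage outside 0..100 falls back to 100.
    if numeric and not snap_to_lines and (line < 0 or line > 100):
        return 100
    # Step 2: any other numeric line is returned unchanged.
    if numeric:
        return line
    # Step 3: auto with snap-to-lines false.
    if not snap_to_lines:
        return 100
    # Steps 4-5: cue not reachable from a media element.
    if showing_tracks_before is None:
        return -1
    # Steps 6-10: count earlier showing tracks, add one, negate.
    n = showing_tracks_before
    n += 1
    return -n
```

  <p class="note">With this sketch, two showing tracks whose active cues both use auto yield
  <code>computed_line(AUTO, True, 0)</code> and <code>computed_line(AUTO, True, 1)</code>, matching
  the example above.</p>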

 </dd>

 <dt><dfn lt="WebVTT cue line alignment">A line alignment</dfn></dt>
 <dd>
  <p>An alignment for the <a lt="WebVTT cue box">cue box</a>'s <a lt="WebVTT cue line">line</a>, one
  of:</p>

  <dl>

   <dt><dfn lt="WebVTT cue line start alignment">Start alignment</dfn></dt>
   <dd>The <a lt="WebVTT cue box">cue box</a>'s top side (for <a lt="WebVTT cue horizontal writing
   direction">horizontal</a> cues), left side (for <a lt="WebVTT cue vertical growing right writing
   direction">vertical growing right</a>), or right side (for <a lt="WebVTT cue vertical growing
   left writing direction">vertical growing left</a>) is aligned at the <a lt="WebVTT cue
   line">line</a>.</dd>

   <dt><dfn lt="WebVTT cue line center alignment">Center alignment</dfn></dt>
   <dd>The <a lt="WebVTT cue box">cue box</a> is centered at the <a lt="WebVTT cue
   line">line</a>.</dd>

   <dt><dfn lt="WebVTT cue line end alignment">End alignment</dfn></dt>
   <dd>The <a lt="WebVTT cue box">cue box</a>'s bottom side (for <a lt="WebVTT cue horizontal
   writing direction">horizontal</a> cues), right side (for <a lt="WebVTT cue vertical growing right
   writing direction">vertical growing right</a>), or left side (for <a lt="WebVTT cue vertical
   growing left writing direction">vertical growing left</a>) is aligned at the <a lt="WebVTT cue
   line">line</a>.</dd>

  </dl>

  <p>By default, the <a lt="WebVTT cue line alignment">line alignment</a> is set to <a lt="WebVTT
  cue line start alignment">start</a>.</p>

  <p class=note>The <a lt="WebVTT cue line alignment">line alignment</a> is separate from the <a
  lt="WebVTT cue text alignment">text alignment</a> &mdash; right-to-left vs. left-to-right cue text
  does not affect the <a lt="WebVTT cue line alignment">line alignment</a>.</p>

 </dd>

 <dt><dfn lt="WebVTT cue position">A position</dfn></dt>
 <dd>
  <p>The <a lt="WebVTT cue position">position</a> defines the indent of the <a lt="WebVTT cue
  box">cue box</a> in the direction defined by the <a lt="WebVTT cue writing direction">writing
  direction</a>.</p>

  <p>The <a lt="WebVTT cue position">position</a> is either a number giving the position of the <a
  lt="WebVTT cue box">cue box</a> as a percentage value or the special value <dfn lt="WebVTT cue
  automatic position">auto</dfn>, which means the position is to depend on the <a lt="WebVTT cue
  text alignment">text alignment</a> of the cue.</p>

  <p>If the cue is not within a <a lt="WebVTT region">region</a>, the percentage value is to be
  interpreted as a percentage of the video dimensions, otherwise as a percentage of the region
  dimensions.</p>

  <p>By default, the <a lt="WebVTT cue position">position</a> is set to <a lt="WebVTT cue automatic
  position">auto</a>.</p>

  <p>If the <a lt="WebVTT cue writing direction">writing direction</a> is <a lt="WebVTT cue
  horizontal writing direction">horizontal</a>, then the <a lt="WebVTT cue position">position</a>
  percentages are relative to the width of the video, otherwise to the height of the video.</p>

  <p>A <a>WebVTT cue</a> has a <dfn lt="cue computed position">computed position</dfn> whose value
  is that returned by the following algorithm, which is defined in terms of the other aspects of the
  cue:</p>

  <ol algorithm="computed position">

   <li><p>If the <a lt="WebVTT cue position">position</a> is numeric between 0 and 100, then return
   the value of the <a lt="WebVTT cue position">position</a> and abort these steps. (Otherwise, the
   <a lt="WebVTT cue position">position</a> is the special value <a lt="WebVTT cue automatic
   position">auto</a>.)</p></li>

   <li><p>If the <a lt="WebVTT cue text alignment">cue text alignment</a> is <a lt="WebVTT cue left
   alignment">left</a>, return 0 and abort these steps.</p></li>

   <li><p>If the <a lt="WebVTT cue text alignment">cue text alignment</a> is <a lt="WebVTT cue right
   alignment">right</a>, return 100 and abort these steps.</p></li>

   <li><p>Otherwise, return 50 and abort these steps.</p></li>

  </ol>
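
  <p class="note">A minimal Python sketch of the steps above, for illustration only. The function
  name and the use of plain strings for the text alignment values and for the special value auto
  are assumptions of the example.</p>

```python
from typing import Union


def computed_position(position: Union[float, str], text_alignment: str) -> float:
    """Return the computed position of a cue as a percentage."""
    # Step 1: a numeric position between 0 and 100 is used as is.
    if isinstance(position, (int, float)) and 0 <= position <= 100:
        return position
    # Steps 2-3: auto position follows left or right text alignment.
    if text_alignment == "left":
        return 0
    if text_alignment == "right":
        return 100
    # Step 4: start, center and end alignment all resolve to 50.
    return 50
```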

  <p class="note">Since the default value of the <a>WebVTT cue text alignment</a> is <a
  lt="WebVTT cue center alignment">center</a>, if there is no <a>WebVTT cue text alignment</a>
  setting for a cue, the <a>WebVTT cue position</a> defaults to 50%.</p>

  <p class="note">Even for <a lt="WebVTT cue horizontal writing direction">horizontal</a> cues with
  right-to-left cue text, the <a lt="WebVTT cue box">cue box</a> is positioned from the left edge of
  the video viewport. This allows defining a rendering space template which can be filled with
  either left-to-right or right-to-left cue text, or both.</p>

  <p>For <a>WebVTT cues</a> that have a <a lt="WebVTT cue size">size</a> other than 100%, and a <a
  lt="WebVTT cue text alignment">text alignment</a> of <a lt="WebVTT cue start alignment">start</a>
  or <a lt="WebVTT cue end alignment">end</a>, authors must not use the default <a lt="WebVTT cue
  automatic position">auto</a> <a lt="WebVTT cue position">position</a>.</p>

  <p class="note">When the <a lt="WebVTT cue text alignment">text alignment</a> is <a lt="WebVTT cue
  start alignment">start</a> or <a lt="WebVTT cue end alignment">end</a>, the <a lt="WebVTT cue
  automatic position">auto</a> <a lt="WebVTT cue position">position</a> is 50%. This is different
  from <a lt="WebVTT cue left alignment">left</a> and <a lt="WebVTT cue right alignment">right</a>
  aligned text, where the <a lt="WebVTT cue automatic position">auto</a> <a lt="WebVTT cue
  position">position</a> is 0% and 100%, respectively. The above requirement is present because it
  can be surprising that automatic positioning doesn't work for <a lt="WebVTT cue start
  alignment">start</a> or <a lt="WebVTT cue end alignment">end</a> aligned text. Since <a>cue
  text</a> can consist of text with left-to-right base direction, or right-to-left base direction,
  or both (on different lines), such automatic positioning would have unexpected results.</p>

 </dd>

 <dt><dfn lt="WebVTT cue position alignment">A position alignment</dfn></dt>
 <dd>
  <p>An alignment for the <a lt="WebVTT cue box">cue box</a> in the dimension of the <a lt="WebVTT
  cue writing direction">writing direction</a>, describing what the <a lt="WebVTT cue
  position">position</a> is anchored to, one of:</p>

  <dl>

   <dt><dfn lt="WebVTT cue position line-left alignment">Line-left alignment</dfn></dt>
   <dd>The <a lt="WebVTT cue box">cue box</a>'s left side (for <a lt="WebVTT cue horizontal writing
   direction">horizontal</a> cues) or top side (otherwise) is aligned at the <a lt="WebVTT cue
   position">position</a>.</dd>

   <dt><dfn lt="WebVTT cue position center alignment">Center alignment</dfn></dt>
   <dd>The <a lt="WebVTT cue box">cue box</a> is centered at the <a lt="WebVTT cue
   position">position</a>.</dd>

   <dt><dfn lt="WebVTT cue position line-right alignment">Line-right alignment</dfn></dt>
   <dd>The <a lt="WebVTT cue box">cue box</a>'s right side (for <a lt="WebVTT cue horizontal writing
   direction">horizontal</a> cues) or bottom side (otherwise) is aligned at the <a lt="WebVTT cue
   position">position</a>.</dd>

   <dt><dfn lt="WebVTT cue position automatic alignment">Auto alignment</dfn></dt>
   <dd>The <a lt="WebVTT cue box">cue box</a>'s alignment depends on the value of the <a lt="WebVTT
   cue text alignment">text alignment</a> of the cue.</dd>

  </dl>

  <p>By default, the <a lt="WebVTT cue position alignment">position alignment</a> is set to <a
  lt="WebVTT cue position automatic alignment">auto</a>.</p>

  <p>A <a>WebVTT cue</a> has a <dfn lt="cue computed position alignment">computed position
  alignment</dfn> whose value is that returned by the following algorithm, which is defined in terms
  of other aspects of the cue:</p>

  <ol algorithm="computed position alignment">

   <li><p>If the <a>WebVTT cue position alignment</a> is not <a lt="WebVTT cue position automatic
   alignment">auto</a>, then return the value of the <a>WebVTT cue position alignment</a> and abort
   these steps.</p></li>

   <li><p>If the <a>WebVTT cue text alignment</a> is <a lt="WebVTT cue left alignment">left</a>,
   return <a lt="WebVTT cue position line-left alignment">line-left</a> and abort these
   steps.</p></li>

   <li><p>If the <a>WebVTT cue text alignment</a> is <a lt="WebVTT cue right alignment">right</a>,
   return <a lt="WebVTT cue position line-right alignment">line-right</a> and abort these
   steps.</p></li>

   <li><p>If the <a>WebVTT cue text alignment</a> is <a lt="WebVTT cue start alignment">start</a>,
   return <a lt="WebVTT cue position line-left alignment">line-left</a> if the base direction of the
   cue text is left-to-right, <a lt="WebVTT cue position line-right alignment">line-right</a>
   otherwise.</p></li>

   <li><p>If the <a>WebVTT cue text alignment</a> is <a lt="WebVTT cue end alignment">end</a>,
   return <a lt="WebVTT cue position line-right alignment">line-right</a> if the base direction of
   the cue text is left-to-right, <a lt="WebVTT cue position line-left alignment">line-left</a>
   otherwise.</p></li>

   <li><p>Otherwise, return <a lt="WebVTT cue position center alignment">center</a>.</p></li>

  </ol>
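
  <p class="note">The steps above can be sketched in Python as follows. This is a non-normative
  illustration. The function name and string values are assumptions, and the
  <code>base_direction</code> parameter is a hypothetical input: this sketch does not determine
  the base direction of the cue text itself.</p>

```python
def computed_position_alignment(position_alignment: str,
                                text_alignment: str,
                                base_direction: str = "ltr") -> str:
    """Return the computed position alignment of a cue.

    base_direction ("ltr" or "rtl") is supplied by the caller in this sketch.
    """
    # Step 1: an explicit position alignment wins.
    if position_alignment != "auto":
        return position_alignment
    # Steps 2-3: left and right map directly.
    if text_alignment == "left":
        return "line-left"
    if text_alignment == "right":
        return "line-right"
    # Steps 4-5: start and end depend on the base direction of the cue text.
    if text_alignment == "start":
        return "line-left" if base_direction == "ltr" else "line-right"
    if text_alignment == "end":
        return "line-right" if base_direction == "ltr" else "line-left"
    # Step 6: everything else (center text alignment).
    return "center"
```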

  <p class="note">Since the <a lt="WebVTT cue position">position</a> always measures from the left
  of the video (for <a lt="WebVTT cue horizontal writing direction">horizontal</a> cues) or the top
  (otherwise), the <a>WebVTT cue position alignment</a> <a lt="WebVTT cue position line-left
  alignment">line-left</a> value varies between left and top for horizontal and vertical cues.</p>

 </dd>

 <dt><dfn lt="WebVTT cue size">A size</dfn></dt>
 <dd>
  <p>A number giving the size of the <a lt="WebVTT cue box">cue box</a>, to be interpreted as a
  percentage of the video, as defined by the <a lt="WebVTT cue writing direction">writing
  direction</a>.</p>

  <p>By default, the <a>WebVTT cue size</a> is set to 100%.</p>

  <p>If the <a lt="WebVTT cue writing direction">writing direction</a> is <a lt="WebVTT cue
  horizontal writing direction">horizontal</a>, then the <a lt="WebVTT cue size">size</a>
  percentages are relative to the width of the video, otherwise to the height of the video.</p>

 </dd>

 <dt><dfn lt="WebVTT cue text alignment">A text alignment</dfn></dt>
 <dd>

  <p>An alignment for all lines of text within the <a lt="WebVTT cue box">cue box</a>, in the
  dimension of the <a lt="WebVTT cue writing direction">writing direction</a>, one of:</p>

  <dl>

   <dt><dfn lt="WebVTT cue start alignment">Start alignment</dfn></dt>
   <dd>The text of each line is individually aligned towards the start side of the box, where the
   start side for that line is determined by using the CSS rules for ''unicode-bidi/plaintext''
   value of the 'unicode-bidi' property. [[!CSS-WRITING-MODES-3]]</dd>

   <dt><dfn lt="WebVTT cue center alignment">Center alignment</dfn></dt>
   <dd>The text is aligned centered between the box's start and end sides.</dd>

   <dt><dfn lt="WebVTT cue end alignment">End alignment</dfn></dt>
   <dd>The text of each line is individually aligned towards the end side of the box, where the end
   side for that line is determined by using the CSS rules for ''unicode-bidi/plaintext'' value of
   the 'unicode-bidi' property. [[!CSS-WRITING-MODES-3]]</dd>

   <dt><dfn lt="WebVTT cue left alignment">Left alignment</dfn></dt>
   <dd>The text is aligned to the box's left side (for <a lt="WebVTT cue horizontal writing
   direction">horizontal</a> cues) or top side (otherwise).</dd>

   <dt><dfn lt="WebVTT cue right alignment">Right alignment</dfn></dt>
   <dd>The text is aligned to the box's right side (for <a lt="WebVTT cue horizontal writing
   direction">horizontal</a> cues) or bottom side (otherwise).</dd>

  </dl>

  <p>By default, the <a lt="WebVTT cue text alignment">text alignment</a> is set to <a lt="WebVTT
  cue center alignment">center</a>.</p>

  <p class=note>The base direction of each line in a cue (which is used by the Unicode Bidirectional
  Algorithm to determine the order in which to display the characters in the line) is determined by
  looking up the first strong directional character in each line, using the CSS
  ''unicode-bidi/plaintext'' algorithm. In the occasional cases where the first strong character on
  a line would produce the wrong base direction for that line, the author can use a U+200E
  LEFT-TO-RIGHT MARK or U+200F RIGHT-TO-LEFT MARK character at the start of the line to correct it.
  [[!BIDI]]</p>

  <div class=example>
   <p>In this example, the second cue will have a right-to-left base direction, rendering as
   "<samp><bdo dir=ltr>.I think ,&#x064A;&#x0644;&#x0627;&#x0639;</bdo></samp>". (Note that the text
   below shows all characters left-to-right; a text editor would not necessarily have the same
   rendering.)</p>

   <pre>
   WEBVTT

   00:00:07.000 --> 00:00:09.000
   What was his name again?

   00:00:09.000 --> 00:00:11.000
   <bdo dir=ltr>&#x0639;&#x0627;&#x0644;&#x064A;, I think.</bdo>
   </pre>

   <p>To change that line to left-to-right base direction, start the line with a U+200E
   LEFT-TO-RIGHT MARK character (it can be escaped as "<code>&amp;lrm;</code>").</p>
  </div>

  <p class=note>Where the base direction of some embedded text within a line needs to be different
  from the surrounding text on that line, this can be achieved by using the paired Unicode bidi
  formatting code characters.</p>

  <div class=example>
   <p>In this example, assuming no bidi formatting code characters are used, the cue text is
   rendered as "<samp><bdo dir=ltr>I've read the book 3 &#x5D3;&#x5E0;&#x5DC;&#x5D9;&#x5D5;&#x5E0;
   times!</bdo></samp>" (i.e. the "3" is on the wrong side of the book title) because of the effect
   of the Unicode Bidirectional Algorithm. (Again, the text below shows all characters
   left-to-right.)</p>

   <pre>
   WEBVTT

   00:00:04.000 --> 00:00:08.000
   <bdo dir=ltr>I've read the book &#x5E0;&#x5D5;&#x5D9;&#x5DC;&#x5E0;&#x5D3; 3 times!</bdo>
   </pre>

   <p>If a U+2068 FIRST STRONG ISOLATE (FSI) character were placed before the book title and a U+2069
   POP DIRECTIONAL ISOLATE (PDI) character after it, the rendering would be the intended "<samp><bdo
   dir=ltr>I've read the book &#x5D3;&#x5E0;&#x5DC;&#x5D9;&#x5D5;&#x5E0; 3 times!</bdo></samp>".
   (Those characters can be escaped as "<code>&amp;#x2068;</code>" and "<code>&amp;#x2069;</code>",
   respectively.)</p>
  </div>

  <p class=note>The default text alignment is <a lt="WebVTT cue center alignment">center
  alignment</a> regardless of the base direction of the cue text. To make the text alignment of each
  line match the base direction of the line (e.g. left for English, right for Hebrew), use <a
  lt="WebVTT cue start alignment">start alignment</a>, or <a lt="WebVTT cue end alignment">end
  alignment</a> for the opposite alignment.</p>

  <div class=example>
   <p>In this example, <a lt="WebVTT cue start alignment">start alignment</a> is used. The first
   line is left-aligned because the base direction is left-to-right, and the second line is
   right-aligned because the base direction is right-to-left.</p>

   <pre>
   WEBVTT

   00:00:00.000 --> 00:00:05.000 align:start
   Hello!
   <bdo dir=ltr>&#x5E9;&#x5DC;&#x5D5;&#x5DD;!</bdo>
   </pre>

   <p>This would render as follows:</p>

   <!-- 50 -->
   <pre>
   <samp>Hello!</samp>
                                               <samp><bdo dir=ltr>!&#x5DD;&#x5D5;&#x5DC;&#x5E9;</bdo></samp>
   </pre>
  </div>

  <p class=note>The <a lt="WebVTT cue left alignment">left alignment</a> and <a lt="WebVTT cue right
  alignment">right alignment</a> can be used to left-align or right-align the cue text regardless of
  its lines' base direction.</p>

 </dd>

 <dt><dfn lt="WebVTT cue region">A region</dfn></dt>
 <dd>
  <p>An optional <a>WebVTT region</a> to which a cue belongs.</p>

  <p>By default, the <a lt="WebVTT region">region</a> is set to null.</p>
 </dd>

</dl>

<p>The associated <a>rules for updating the text track rendering</a> of <a lt="WebVTT cue">WebVTT
cues</a> are the <a>rules for updating the display of WebVTT text tracks</a>.</p>

<div class="impl">

 <p>When a <a>WebVTT cue</a> whose <a lt="text track cue active flag">active flag</a> is set has its
 <a lt="WebVTT cue writing direction">writing direction</a>, <a lt="WebVTT cue snap-to-lines
 flag">snap-to-lines flag</a>, <a lt="WebVTT cue line">line</a>, <a lt="WebVTT cue line
 alignment">line alignment</a>, <a lt="WebVTT cue position">position</a>, <a lt="WebVTT cue position
 alignment">position alignment</a>, <a lt="WebVTT cue size">size</a>, <a lt="WebVTT cue text
 alignment">text alignment</a>, <a lt="WebVTT cue region">region</a>, or <a lt="cue text">text</a>
 change value, then the user agent must empty the <a>text track cue display state</a>, and then
 immediately run the <a>text track</a>'s <a>rules for updating the display of WebVTT text
 tracks</a>.</p>

</div>


<h3 id=regions>WebVTT caption or subtitle regions</h3>

<p>A <dfn>WebVTT region</dfn> represents a subpart of the video viewport and provides a limited
rendering area for <a lt="WebVTT caption or subtitle cue">WebVTT caption or subtitle cues</a>.</p>

<p class=note>Regions provide a means to group caption or subtitle cues so the cues can be rendered
together, which is particularly important when scrolling up.</p>

<p>Each <a>WebVTT region</a> consists of:</p>

<dl>

 <dt><dfn lt="WebVTT region identifier">An identifier</dfn></dt>
 <dd>
  <p>An arbitrary string of zero or more characters other than U+0020 SPACE or U+0009 CHARACTER
  TABULATION characters. The string must not contain the substring "-->" (U+002D HYPHEN-MINUS, U+002D
  HYPHEN-MINUS, U+003E GREATER-THAN SIGN). Defaults to the empty string.</p>
 </dd>

 <dt><dfn lt="WebVTT region width">A width</dfn></dt>
 <dd>
  <p>A number giving the width of the box within which the text of each line of the containing cues
  is to be rendered, to be interpreted as a percentage of the video width. Defaults to 100.</p>
 </dd>

 <dt><dfn lt="WebVTT region lines">A lines value</dfn></dt>
 <dd>
  <p>A number giving the number of lines of the box within which the text of each line of the
  containing cues is to be rendered. Defaults to 3.</p>

  <p class="note">Since a WebVTT region defines a fixed rendering area, a cue that has more lines
  than the region allows will be clipped. For scrolling regions, the clipping happens at the top,
  for non-scrolling regions it happens at the bottom.</p>
 </dd>

 <dt><dfn lt="WebVTT region anchor">A region anchor point</dfn></dt>
 <dd>
  <p>Two numbers giving the x and y coordinates within the region which is anchored to the video
  viewport and does not change location even when the region does, e.g. because of font size
  changes. Defaults to (0,100), i.e. the bottom left corner of the region.</p>
 </dd>

 <dt><dfn lt="WebVTT region viewport anchor">A region viewport anchor point</dfn></dt>
 <dd>
  <p>Two numbers giving the x and y coordinates within the video viewport to which the region anchor
  point is anchored. Defaults to (0,100), i.e. the bottom left corner of the video viewport.</p>
 </dd>

 <dt><dfn lt="WebVTT region scroll">A scroll value</dfn></dt>
 <dd>
  <p>One of the following:</p>
  <dl>
   <dt><dfn lt="WebVTT region scroll none">None</dfn></dt>
   <dd>Indicates that the cues in the region are not to scroll and instead stay fixed at the
   location they were first painted in.</dd>

   <dt><dfn lt="WebVTT region scroll up">Up</dfn></dt>
   <dd>Indicates that the cues in the region will be added at the bottom of the region and push any
   already displayed cues in the region up until all lines of the new cue are visible in the
   region.</dd>
   <!-- in the future we may introduce scroll="down"-->
  </dl>
 </dd>
</dl>
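
<p class="note">As an illustration, the components of a region and their stated defaults could be
modelled in Python as shown below. This is a sketch only. The class and field names are
hypothetical, and the default of <code>"none"</code> for the scroll value is an assumption of the
example, since no default is stated in the list above.</p>

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Region:
    """Hypothetical container for the components of a WebVTT region."""
    identifier: str = ""                                  # empty string by default
    width: float = 100.0                                  # percentage of the video width
    lines: int = 3                                        # number of lines in the box
    region_anchor: Tuple[float, float] = (0.0, 100.0)     # bottom left corner of the region
    viewport_anchor: Tuple[float, float] = (0.0, 100.0)   # bottom left corner of the viewport
    scroll: str = "none"                                  # assumed default; "none" or "up"


def is_valid_region_identifier(identifier: str) -> bool:
    """Check the identifier constraints: no space, no tab, no "-->" substring."""
    return " " not in identifier and "\t" not in identifier and "-->" not in identifier
```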

<div class="note">
 <p>The following diagram illustrates how anchoring of a region to a video viewport works. The black
 cross is the anchor, orange explains the anchor's offset within the region and green the anchor's
 offset within the video viewport. Think of it as sticking a pin through a note onto a board:</p>
 <p><img src="webvtt-region-diagram.png" alt="visual explanation of WebVTT regions"
 longdesc=#regionsExplained width="862" height="499"></p>
 <p id=regionsExplained>Image description: Within the video viewport, there is a WebVTT region.
 Inside the region, there is an anchor point marked with a black cross. The vertical and horizontal
 distance from the video viewport's edges to the anchor is marked with green arrows, representing
 the region viewport anchor X and Y offsets. The vertical and horizontal distance from the region's
 edges to the anchor is marked with orange arrows, representing the region anchor X and Y offsets.
 The size of the region is represented by the region width for the horizontal axis, and region lines
 for the vertical axis.</p>
</div>

<p>For parsing, we also need the following:</p>

<dl>
 <dt><dfn lt="text track list of regions">A text track list of regions</dfn></dt>

 <dd>

  <p>A list of zero or more <a lt="WebVTT region">WebVTT regions</a>.</p>

 </dd>
</dl>

<h3 id=chapter-cues>WebVTT chapter cues</h3>

<p>A <dfn export>WebVTT chapter cue</dfn> is a <a>WebVTT cue</a> whose <a>cue text</a> is interpreted as a
chapter title that describes the chapter as a navigation target.</p>

<p>Chapter cues mark up the timeline of an audio or video file in consecutive, non-overlapping
intervals. It is further possible to subdivide these intervals into sub-chapters building a
navigation tree.</p>

<h3 id=metadata-cues>WebVTT metadata cues</h3>

<p>A <dfn export>WebVTT metadata cue</dfn> is a <a>WebVTT cue</a> whose <a>cue text</a> is interpreted as
time-aligned metadata.</p>


<h2 id=syntax>Syntax</h2>


<h3 id=file-structure>WebVTT file structure</h3>

<p>A <dfn>WebVTT file</dfn> must consist of a <a>WebVTT file body</a> encoded as UTF-8 and labeled
with the <a>MIME type</a> <code>text/vtt</code>. [[!RFC3629]]</p>

<p>A <dfn>WebVTT file body</dfn> consists of the following components, in the following order:</p>

<ol algorithm="WebVTT file body">

 <li>An optional U+FEFF BYTE ORDER MARK (BOM) character.</li>

 <li>The string "<code>WEBVTT</code>".</li>

 <li>Optionally, either a U+0020 SPACE character or a U+0009 CHARACTER TABULATION (tab) character
 followed by any number of characters that are not U+000A LINE FEED (LF) or U+000D CARRIAGE RETURN
 (CR) characters.</li> <!-- allows for Emacs line -->

 <li>Two or more <a lt="WebVTT line terminator">WebVTT line terminators</a> to terminate the line
 with the file magic and separate it from the rest of the body.</li>

 <li>Zero or more <a lt="WebVTT region definition block">WebVTT region definition blocks</a>, <a
 lt="WebVTT style block">WebVTT style blocks</a> and <a lt="WebVTT comment block">WebVTT comment
 blocks</a> separated from each other by one or more <a lt="WebVTT line terminator">WebVTT line
 terminators</a>.</li>

 <li>Zero or more <a lt="WebVTT line terminator">WebVTT line terminators</a>.</li>

 <li>Zero or more <a lt="WebVTT cue block">WebVTT cue blocks</a> and <a lt="WebVTT comment
 block">WebVTT comment blocks</a> separated from each other by one or more <a lt="WebVTT line
 terminator">WebVTT line terminators</a>.</li>

 <li>Zero or more <a lt="WebVTT line terminator">WebVTT line terminators</a>.</li>

</ol>
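
<p class="note">The first four components above can be checked with a regular expression, as in
the following Python sketch. It is illustrative only: it checks authoring syntax for the start of
a decoded file body, it is not the parser algorithm, and the function and variable names are
hypothetical. The lookahead keeps a CRLF pair from being counted as two terminators.</p>

```python
import re

# One line terminator: CRLF, LF, or a CR that is not the first half of a CRLF pair.
_TERMINATOR = r"(?:\r\n|\n|\r(?!\n))"

# Optional BOM, "WEBVTT", optional space or tab plus rest of line, two or more terminators.
_HEADER = re.compile(r"\ufeff?WEBVTT(?:[ \t][^\n\r]*)?" + _TERMINATOR + r"{2,}")


def starts_with_webvtt_header(body: str) -> bool:
    """Return True if the decoded file body begins with a well-formed header."""
    return _HEADER.match(body) is not None
```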

<p>A <dfn>WebVTT line terminator</dfn> consists of one of the following:</p>

<ul class="brief">
 <li>A U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair.</li>
 <li>A single U+000A LINE FEED (LF) character.</li>
 <li>A single U+000D CARRIAGE RETURN (CR) character.</li>
</ul>
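
<p class="note">For illustration, a tool could split a decoded file body into lines on these three
terminators as in the Python sketch below. The function name is hypothetical. The CRLF alternative
is listed first so that a pair is treated as a single terminator.</p>

```python
import re

# CRLF must come first so that "\r\n" is not split into two terminators.
_LINE_TERMINATOR = re.compile(r"\r\n|\n|\r")


def split_lines(text: str) -> list:
    """Split text on CRLF, LF, or CR line terminators."""
    return _LINE_TERMINATOR.split(text)
```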

<p>A <dfn>WebVTT region definition block</dfn> consists of the following components, in the given
order:</p>

<ol>
 <li>The string "<code>REGION</code>" (U+0052 LATIN CAPITAL LETTER R, U+0045 LATIN CAPITAL LETTER E,
 U+0047 LATIN CAPITAL LETTER G, U+0049 LATIN CAPITAL LETTER I, U+004F LATIN CAPITAL LETTER O, U+004E
 LATIN CAPITAL LETTER N).</li>
 <li>Zero or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters.</li>
 <li>A <a>WebVTT line terminator</a>.</li>
 <li>A <a>WebVTT region settings list</a>.</li>
 <li>A <a>WebVTT line terminator</a>.</li>
</ol>

<p>A <dfn>WebVTT style block</dfn> consists of the following components, in the given order:</p>

<ol>
 <li>The string "<code>STYLE</code>" (U+0053 LATIN CAPITAL LETTER S, U+0054 LATIN CAPITAL LETTER T,
 U+0059 LATIN CAPITAL LETTER Y, U+004C LATIN CAPITAL LETTER L, U+0045 LATIN CAPITAL LETTER E).</li>
 <li>Zero or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters.</li>
 <li>A <a>WebVTT line terminator</a>.</li>
 <li>Any sequence of zero or more characters other than U+000A LINE FEED (LF) characters and U+000D
 CARRIAGE RETURN (CR) characters, each optionally separated from the next by a <a>WebVTT line
 terminator</a>, except that the entire resulting string must not contain the substring
 "<code>--></code>" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN). The string
 represents a CSS style sheet; the requirements given in the relevant CSS specifications apply.
 [[!CSS22]]</li>
 <li>A <a>WebVTT line terminator</a>.</li>
</ol>

<p>A <dfn>WebVTT cue block</dfn> consists of the following components, in the given order:</p>

<ol>
 <li>Optionally, a <a>WebVTT cue identifier</a> followed by a <a>WebVTT line terminator</a>.</li>
 <li><a>WebVTT cue timings</a>.</li>
 <li>Optionally, one or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters
 followed by a <a>WebVTT cue settings list</a>.</li>
 <li>A <a>WebVTT line terminator</a>.</li>
 <li>The <dfn>cue payload</dfn>: either <a>WebVTT caption or subtitle cue text</a>, <a>WebVTT
 chapter title text</a>, or <a>WebVTT metadata text</a>, but it must not contain the substring
 "<code>--></code>" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN).</li>
 <li>A <a>WebVTT line terminator</a>.</li>
</ol>

<p class="note">A <a>WebVTT cue block</a> corresponds to one piece of time-aligned text or data in
the <a>WebVTT file</a>, for example one subtitle. The <a>cue payload</a> is the text or data
associated with the cue.</p>

<p>A <dfn>WebVTT cue identifier</dfn> is any sequence of one or more characters not containing the
substring "<code>--></code>" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN),
nor containing any U+000A LINE FEED (LF) characters or U+000D CARRIAGE RETURN (CR) characters.</p>

<p>A <a>WebVTT cue identifier</a> must be unique amongst all the <a lt="WebVTT cue
identifier">WebVTT cue identifiers</a> of all <a lt="WebVTT cue">WebVTT cues</a> of a <a>WebVTT
file</a>.</p>

<p class="note">A <a>WebVTT cue identifier</a> can be used to reference a specific cue, for example
from script or CSS.</p>
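
<div class="example">

 <p>The identifier rules above can be sketched as a small, non-normative authoring check. The
 function name <code>cue_identifier_errors</code> and its error messages are hypothetical and exist
 only for illustration; the check assumes that the identifiers of one file have already been
 collected into a list.</p>

```python
from typing import Iterable, List


def cue_identifier_errors(identifiers: Iterable[str]) -> List[str]:
    """Collect authoring errors for the cue identifiers of one file."""
    errors: List[str] = []
    seen = set()
    for identifier in identifiers:
        if identifier == "":
            errors.append("identifier must contain at least one character")
        elif "-->" in identifier:
            errors.append(f"{identifier!r} contains the substring '-->'")
        elif "\n" in identifier or "\r" in identifier:
            errors.append(f"{identifier!r} contains a line terminator")
        elif identifier in seen:
            errors.append(f"{identifier!r} is used by more than one cue")
        seen.add(identifier)
    return errors
```

</div>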

<p>The <dfn>WebVTT cue timings</dfn> part of a <a>WebVTT cue block</a> consists of the following
components, in the given order:</p>

<ol>

 <!-- we could allow leading and trailing spaces and tabs, and make the space between the arrow
 either optional or allow multiple spaces or tabs -->

 <li>A <a>WebVTT timestamp</a> representing the start time offset of the cue. The time represented
 by this <a>WebVTT timestamp</a> must be greater than or equal to the start time offsets of all
 previous cues in the file.</li>

 <li>One or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters.</li>

 <li>The string "<code>--></code>" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN
 SIGN).</li>

 <li>One or more U+0020 SPACE characters or U+0009 CHARACTER TABULATION (tab) characters.</li>

 <li>A <a>WebVTT timestamp</a> representing the end time offset of the cue. The time represented by
 this <a>WebVTT timestamp</a> must be greater than the start time offset of the cue.</li>

</ol>

<p class="note">The <a>WebVTT cue timings</a> give the start and end offsets of the <a>WebVTT cue
block</a>. Different cues can overlap. Cues are always listed ordered by their start time.</p>
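
<div class="example">

 <p>The two ordering requirements on cue timings can be illustrated with the following
 non-normative sketch. It assumes the timestamps have already been converted to integer
 milliseconds; the function name <code>timing_errors</code> and the message wording are
 hypothetical.</p>

```python
from typing import List, Optional, Tuple


def timing_errors(cues: List[Tuple[int, int]]) -> List[str]:
    """Check (start, end) pairs, given in file order, against the timing rules."""
    errors: List[str] = []
    latest_start: Optional[int] = None
    for index, (start, end) in enumerate(cues):
        # The end time offset must be strictly greater than the start time offset.
        if end <= start:
            errors.append(f"cue {index}: end time is not after start time")
        # The start time offset must not precede the start of any previous cue.
        if latest_start is not None and start < latest_start:
            errors.append(f"cue {index}: start time precedes an earlier cue")
        latest_start = start if latest_start is None else max(latest_start, start)
    return errors
```

 <p>Overlapping cues such as <code>(0, 1000)</code> followed by <code>(500, 2000)</code> produce
 no errors, because only the start times need to be in order.</p>

</div>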

<p>A <dfn>WebVTT timestamp</dfn> consists of the following components, in the given order:</p>

<ol>

 <li>
  Optionally (required if |hours| is non-zero):

  <ol>

   <li>Two or more <a>ASCII digits</a>, representing the |hours| as a base ten integer.</li>

   <li>A U+003A COLON character (:).</li>

  </ol>

 </li>

 <li>Two <a>ASCII digits</a>, representing the |minutes| as a base ten integer in the range
 0&nbsp;&le;&nbsp;|minutes|&nbsp;&le;&nbsp;59.</li>

 <li>A U+003A COLON character (:).</li>

 <li>Two <a>ASCII digits</a>, representing the |seconds| as a base ten integer in the range
 0&nbsp;&le;&nbsp;|seconds|&nbsp;&le;&nbsp;59.</li>

 <li>A U+002E FULL STOP character (.).</li>

 <li>Three <a>ASCII digits</a>, representing the thousandths of a second |seconds-frac| as a base
 ten integer.</li>

</ol>

<p class="note">A <a>WebVTT timestamp</a> is always interpreted relative to the <a>current playback
position</a> of the media data that the WebVTT file is to be synchronized with.</p>
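
<div class="example">

 <p>As a non-normative illustration, the timestamp syntax above can be matched with a regular
 expression. The helper name <code>parse_timestamp_ms</code> is hypothetical, and returning
 integer milliseconds is a choice made for this sketch only.</p>

```python
import re
from typing import Optional

# Optional hours (two or more digits), then two-digit minutes and seconds,
# a full stop, and exactly three digits for the thousandths of a second.
_TIMESTAMP = re.compile(r"(?:([0-9]{2,}):)?([0-9]{2}):([0-9]{2})\.([0-9]{3})")


def parse_timestamp_ms(text: str) -> Optional[int]:
    """Return the timestamp in milliseconds, or None if the syntax is not met."""
    match = _TIMESTAMP.fullmatch(text)
    if match is None:
        return None
    hours = int(match.group(1) or "0")
    minutes = int(match.group(2))
    seconds = int(match.group(3))
    thousandths = int(match.group(4))
    # Minutes and seconds are both limited to the range 0..59.
    if minutes > 59 or seconds > 59:
        return None
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + thousandths
```

 <p>With this sketch, <code>"01:02:03.004"</code> yields 3723004, whereas
 <code>"1:00.000"</code> and <code>"00:60.000"</code> are rejected.</p>

</div>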

<p>A <dfn>WebVTT cue settings list</dfn> consists of a sequence of zero or more <dfn lt="WebVTT cue
setting">WebVTT cue settings</dfn> in any order, separated from each other by one or more U+0020
SPACE characters or U+0009 CHARACTER TABULATION (tab) characters. Each setting consists of the
following components, in the order given:</p>

<ol>
 <li>A <a lt="WebVTT cue setting name">WebVTT cue setting name</a>.</li>
 <li>An optional U+003A COLON (colon) character.</li>
 <li>An optional <a lt="WebVTT cue setting value">WebVTT cue setting value</a>.</li>
</ol>

<p>A <dfn>WebVTT cue setting name</dfn> and a <dfn>WebVTT cue setting value</dfn> each consist of
any sequence of one or more characters other than U+000A LINE FEED (LF) characters and U+000D
CARRIAGE RETURN (CR) characters, except that the entire resulting string must not contain the
substring "<code>--></code>" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN
SIGN).</p>
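
<div class="example">

 <p>The structure of a cue settings list can be sketched as follows. This is a non-normative
 illustration of the syntax, not the parsing algorithm; the function name
 <code>split_cue_settings</code> is hypothetical, and splitting each setting at its first colon
 is an assumption of this sketch.</p>

```python
import re
from typing import List, Tuple


def split_cue_settings(settings: str) -> List[Tuple[str, str]]:
    """Split a settings list into (name, value) pairs, keeping the file order."""
    pairs: List[Tuple[str, str]] = []
    # Settings are separated by one or more spaces or tabs.
    for item in re.split(r"[ \t]+", settings.strip(" \t")):
        if not item:
            continue
        # Both the colon and the value are optional in the generic syntax.
        name, _, value = item.partition(":")
        pairs.append((name, value))
    return pairs
```

</div>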

<p>A <dfn>WebVTT percentage</dfn> consists of the following components, in the given order:</p>

<ol>
 <li>One or more <a>ASCII digits</a>.</li>
 <li>
  Optionally:
  <ol>
   <li>A U+002E FULL STOP character (.).</li>
   <li>One or more <a>ASCII digits</a>.</li>
  </ol>
 </li>
 <li>A U+0025 PERCENT SIGN character (%).</li>
</ol>

<p>When interpreted as a number, a <a>WebVTT percentage</a> must be in the range 0..100.</p>
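
<div class="example">

 <p>A non-normative sketch of the percentage syntax and its range restriction follows. The helper
 name <code>parse_percentage</code> is hypothetical.</p>

```python
import re
from typing import Optional

# One or more digits, an optional fractional part, then a percent sign.
_PERCENTAGE = re.compile(r"([0-9]+(?:\.[0-9]+)?)%")


def parse_percentage(text: str) -> Optional[float]:
    """Return the numeric value, or None if the syntax or range is not met."""
    match = _PERCENTAGE.fullmatch(text)
    if match is None:
        return None
    value = float(match.group(1))
    # The syntax cannot express negative numbers, so only the upper bound
    # needs an explicit check.
    return value if value <= 100 else None
```

</div>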

<p>A <dfn>WebVTT comment block</dfn> consists of the following components, in the given order:</p>

<ol>
 <li>The string "<code>NOTE</code>".</li>
 <li>
  Optionally, the following components, in the given order:
  <ol>
   <li>
    Either:
    <ul>
     <li>A U+0020 SPACE character or U+0009 CHARACTER TABULATION (tab) character.</li>
     <li>A <a>WebVTT line terminator</a>.</li>
    </ul>
   </li>
   <li>Any sequence of zero or more characters other than U+000A LINE FEED (LF) characters and
   U+000D CARRIAGE RETURN (CR) characters, each optionally separated from the next by a <a>WebVTT
   line terminator</a>, except that the entire resulting string must not contain the substring
   "<code>--></code>" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN).</li>
  </ol>
 </li>
 <li>A <a>WebVTT line terminator</a>.</li>
</ol>

<p class="note">A <a>WebVTT comment block</a> is ignored by the parser.</p>


<h3 id=types-of-webvtt-cue-payload>Types of WebVTT cue payload</h3>


<h4 id=metadata-text>WebVTT metadata text</h4>

<p><dfn>WebVTT metadata text</dfn> consists of any sequence of zero or more characters other than
U+000A LINE FEED (LF) characters and U+000D CARRIAGE RETURN (CR) characters, each optionally
separated from the next by a <a>WebVTT line terminator</a>. (In other words, any text that does not
have two consecutive <a lt="WebVTT line terminator">WebVTT line terminators</a> and does not start
or end with a <a>WebVTT line terminator</a>.)</p>

<p><a>WebVTT metadata text</a> cues are only useful for scripted applications (e.g. using the
<code>metadata</code> <a>text track kind</a> in an HTML <a>text track</a>).</p>


<h4 id=caption-text>WebVTT caption or subtitle cue text</h4>

<p><dfn>WebVTT caption or subtitle cue text</dfn> is <a>cue payload</a> that consists of zero or
more <a>WebVTT caption or subtitle cue components</a>, in any order, each optionally separated from
the next by a <a>WebVTT line terminator</a>.</p>

<p>The <dfn>WebVTT caption or subtitle cue components</dfn> are:</p>

<ul>

 <li>A <a>WebVTT cue class span</a>.</li>
 <li>A <a>WebVTT cue italics span</a>.</li>
 <li>A <a>WebVTT cue bold span</a>.</li>
 <li>A <a>WebVTT cue underline span</a>.</li>
 <li>A <a>WebVTT cue ruby span</a>.</li>
 <li>A <a>WebVTT cue voice span</a>.</li>
 <li>A <a>WebVTT cue language span</a>.</li>

 <li>A <a>WebVTT cue timestamp</a>.</li>

 <li>A <a>WebVTT cue text span</a>, representing the text of the cue.</li>

 <li>An <a lt="character references">HTML character reference</a>, representing one or two Unicode
 code points, as defined in HTML, in the text of the cue. [[!HTML]]</li>

</ul>

<p>All <a>WebVTT caption or subtitle cue components</a> except the HTML character reference may have
one or more <dfn>cue component class names</dfn> attached to them by separating each cue component
class name from the cue component start tag using the period ('.') notation. The class name must
immediately follow the period (.).</p>

<p><dfn>WebVTT cue internal text</dfn> consists of an optional <a>WebVTT line terminator</a>,
followed by zero or more <a>WebVTT caption or subtitle cue components</a>, in any order, each
optionally followed by a <a>WebVTT line terminator</a>.</p>

<p>A <dfn>WebVTT cue class span</dfn> consists of a <a>WebVTT cue span start tag</a>
"<code>c</code>" that disallows an annotation, <a>WebVTT cue internal text</a> representing cue
text, and a <a>WebVTT cue span end tag</a> "<code>c</code>".</p>

<p>A <dfn>WebVTT cue italics span</dfn> consists of a <a>WebVTT cue span start tag</a>
"<code>i</code>" that disallows an annotation, <a>WebVTT cue internal text</a> representing the
italicized text, and a <a>WebVTT cue span end tag</a> "<code>i</code>".</p>

<p>A <dfn>WebVTT cue bold span</dfn> consists of a <a>WebVTT cue span start tag</a> "<code>b</code>"
that disallows an annotation, <a>WebVTT cue internal text</a> representing the bold text, and a
<a>WebVTT cue span end tag</a> "<code>b</code>".</p>

<p>A <dfn>WebVTT cue underline span</dfn> consists of a <a>WebVTT cue span start tag</a>
"<code>u</code>" that disallows an annotation, <a>WebVTT cue internal text</a> representing the
underlined text, and a <a>WebVTT cue span end tag</a> "<code>u</code>".</p>

<p>A <dfn>WebVTT cue ruby span</dfn> consists of the following components, in the order given:</p>

<ol>
 <li>A <a>WebVTT cue span start tag</a> "<code>ruby</code>" that disallows an annotation.</li>
 <li>
  One or more occurrences of the following group of components, in the order given:
  <ol>
   <li><a>WebVTT cue internal text</a>, representing the ruby base.</li>
   <li>A <a>WebVTT cue span start tag</a> "<code>rt</code>" that disallows an annotation.</li>
   <li>A <dfn>WebVTT cue ruby text span</dfn>: <a>WebVTT cue internal text</a>, representing the
   ruby text component of the ruby annotation.</li>
   <li>A <a>WebVTT cue span end tag</a> "<code>rt</code>". If this is the last occurrence of this
   group of components in the <a>WebVTT cue ruby span</a>, then this last end tag string may be
   omitted.</li>
  </ol>
 </li>
 <li>If the last end tag string was not omitted: Optionally, a <a>WebVTT line terminator</a>.</li>
 <li>If the last end tag string was not omitted: Zero or more U+0020 SPACE characters or U+0009
 CHARACTER TABULATION (tab) characters, each optionally followed by a <a>WebVTT line
 terminator</a>.</li>
 <li>A <a>WebVTT cue span end tag</a> "<code>ruby</code>".</li>
</ol>

<p class="note">Cue positioning controls the positioning of the baseline text, not the ruby
text.</p>

<p class="note">Ruby in WebVTT is a subset of the ruby features in HTML. This might be extended in
the future to also support an object for ruby base text as well as complex ruby, when these features
are more mature in HTML and CSS. [[!HTML]] [[CSS3-RUBY]]</p>

<p>A <dfn>WebVTT cue voice span</dfn> consists of the following components, in the order given:</p>

<ol>
 <li>A <a>WebVTT cue span start tag</a> "<code>v</code>" that requires an annotation; the annotation
 represents the name of the voice.</li>
 <li><a>WebVTT cue internal text</a>.</li>
 <li>A <a>WebVTT cue span end tag</a> "<code>v</code>". If this <a>WebVTT cue voice span</a> is the
 only <a lt="WebVTT caption or subtitle cue components">component</a> of its <a>WebVTT caption or
 subtitle cue text</a> sequence, then the end tag may be omitted for brevity.</li>
</ol>

<p>A <dfn>WebVTT cue language span</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li>A <a>WebVTT cue span start tag</a> "<code>lang</code>" that requires an annotation; the
 annotation represents the language of the following component, and must be a valid BCP 47 language
 tag. [[!BCP47]]</li>
 <li><a>WebVTT cue internal text</a>.</li>
 <li>A <a>WebVTT cue span end tag</a> "<code>lang</code>".</li>
</ol>

<p class=note>The requirement above regarding a valid BCP 47 language tag is an authoring requirement,
so a conformance checker will do validity checking of the language tag, but other user agents will
not.</p>

<p>A <dfn>WebVTT cue span start tag</dfn> has a |tag name| and either <!--allows,-->
requires<!--,--> or disallows an annotation, and consists of the following components, in the order
given:</p>

<ol>

 <li>A U+003C LESS-THAN SIGN character (&lt;).</li>

 <li>The |tag name|.</li>

 <li>
  Zero or more occurrences of the following sequence:

  <ol>

   <li>A U+002E FULL STOP character (.).</li>

   <li>One or more characters other than U+0009 CHARACTER TABULATION (tab) characters, U+000A LINE
   FEED (LF) characters, U+000D CARRIAGE RETURN (CR) characters, U+0020 SPACE characters, U+0026
   AMPERSAND characters (&amp;), U+003C LESS-THAN SIGN characters (&lt;), U+003E GREATER-THAN SIGN
   characters (>), and U+002E FULL STOP characters (.), representing a class that describes the cue
   span's significance.</li>

  </ol>

 </li>

 <li>
  <p>If the start tag requires an annotation: a U+0020 SPACE character or a U+0009 CHARACTER
  TABULATION (tab) character, followed by one or more of the following components, the concatenation
  of their representations having a value that contains at least one character other than U+0020
  SPACE and U+0009 CHARACTER TABULATION (tab) characters:</p>

  <ul>
   <li><a>WebVTT cue span start tag annotation text</a>, representing the text of the
   annotation.</li>
   <li>An <a lt="character references">HTML character reference</a>, representing one or two Unicode
   code points, as defined in HTML, in the text of the annotation. [[!HTML]]</li>
  </ul>

 </li>

 <li>A U+003E GREATER-THAN SIGN character (>).</li>

</ol>
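
<div class="example">

 <p>The start tag syntax can be approximated with a regular expression, as in the non-normative
 sketch below. The names <code>StartTag</code> and <code>parse_start_tag</code> are hypothetical.
 The list of tag names is taken from the spans defined in this section, and the sketch does not
 validate HTML character references inside the annotation.</p>

```python
import re
from typing import List, NamedTuple, Optional


class StartTag(NamedTuple):
    name: str
    classes: List[str]
    annotation: Optional[str]


_START_TAG = re.compile(
    r"<(c|i|b|u|ruby|rt|v|lang)"      # tag name
    r"((?:\.[^\t\n\r &<>.]+)*)"       # zero or more ".class" parts
    r"(?:[ \t]([^\n\r>]+))?"          # optional annotation after a space or tab
    r">"
)
_REQUIRES_ANNOTATION = {"v", "lang"}


def parse_start_tag(text: str) -> Optional[StartTag]:
    """Return the parts of a start tag, or None if the syntax is not met."""
    match = _START_TAG.fullmatch(text)
    if match is None:
        return None
    name, class_part, annotation = match.groups()
    # An annotation must contain something other than spaces and tabs.
    has_annotation = annotation is not None and annotation.strip(" \t") != ""
    if has_annotation != (name in _REQUIRES_ANNOTATION):
        return None
    classes = class_part.split(".")[1:] if class_part else []
    return StartTag(name, classes, annotation if has_annotation else None)
```

 <p>Under these assumptions, <code>&lt;v.loud Esme></code> has the tag name <code>v</code>, the
 class <code>loud</code> and the annotation <code>Esme</code>, while <code>&lt;v></code> and
 <code>&lt;b bold></code> are both rejected.</p>

</div>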

<p>A <dfn>WebVTT cue span end tag</dfn> has a |tag name| and consists of the following components,
in the order given:</p>

<ol>
 <li>A U+003C LESS-THAN SIGN character (&lt;).</li>
 <li>A U+002F SOLIDUS character (/).</li>
 <li>The |tag name|.</li>
 <li>A U+003E GREATER-THAN SIGN character (>).</li>
</ol>

<p>A <dfn>WebVTT cue timestamp</dfn> consists of a U+003C LESS-THAN SIGN character (&lt;), followed
by a <a>WebVTT timestamp</a> representing the time that the given point in the cue becomes active,
followed by a U+003E GREATER-THAN SIGN character (>). The time represented by the <a>WebVTT
timestamp</a> must be greater than the times represented by any previous <a lt="WebVTT cue
timestamp">WebVTT cue timestamps</a> in the cue, as well as greater than the cue's start time
offset, and less than the cue's end time offset.</p>

<p>A <dfn>WebVTT cue text span</dfn> consists of one or more characters other than U+000A LINE FEED
(LF) characters, U+000D CARRIAGE RETURN (CR) characters, U+0026 AMPERSAND characters (&amp;), and
U+003C LESS-THAN SIGN characters (&lt;).</p>

<p><dfn>WebVTT cue span start tag annotation text</dfn> consists of one or more characters other
than U+000A LINE FEED (LF) characters, U+000D CARRIAGE RETURN (CR) characters, U+0026 AMPERSAND
characters (&amp;), and U+003E GREATER-THAN SIGN characters (>).</p>


<h4 id=chapter-title-text>WebVTT chapter title text</h4>

<p><dfn>WebVTT chapter title text</dfn> is <a>cue text</a> that makes use of zero or more of the
following components, each optionally separated from the next by a <a>WebVTT line
terminator</a>:</p>

<ul>
 <li><a>WebVTT cue text span</a></li>
 <li><a lt="character references">HTML character reference</a> [[!HTML]]</li>
</ul>


<h3 id=region-settings>WebVTT region settings</h3>

<p>A <a>WebVTT cue settings list</a> can contain a reference to a <a>WebVTT region</a>. To define a
region, a <a>WebVTT region definition block</a> is specified.</p>

<p>The <dfn>WebVTT region settings list</dfn> consists of zero or more of the following components,
in any order, separated from each other by one or more U+0020 SPACE characters, U+0009 CHARACTER
TABULATION (tab) characters, or <a>WebVTT line terminators</a>, except that the string must not
contain two consecutive <a>WebVTT line terminators</a>. Each component must not be included more
than once per <a>WebVTT region settings list</a> string.</p>

<ul>
 <li>A <a>WebVTT region identifier setting</a>.</li>
 <li>A <a>WebVTT region width setting</a>.</li>
 <li>A <a>WebVTT region lines setting</a>.</li>
 <li>A <a>WebVTT region anchor setting</a>.</li>
 <li>A <a>WebVTT region viewport anchor setting</a>.</li>
 <li>A <a>WebVTT region scroll setting</a>.</li>
</ul>

<p class="note">The <a>WebVTT region settings list</a> gives configuration options regarding the
dimensions, positioning and anchoring of the region. For example, it allows a group of cues within a
region to be anchored in the center of the region and the center of the video viewport. In this
example, when the font size grows, the region grows uniformly in all directions from the center.</p>

<p>A <dfn>WebVTT region identifier setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>id</code>".</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>An arbitrary string of one or more characters other than <a>ASCII whitespace</a>. The string
 must not contain the substring "<code>--></code>" (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E
 GREATER-THAN SIGN).</p></li>
</ol>

<p>A <a>WebVTT region identifier setting</a> must be unique amongst all the <a lt="WebVTT region
identifier setting">WebVTT region identifier settings</a> of all <a lt="WebVTT region">WebVTT
regions</a> of a <a>WebVTT file</a>.</p>

<p>A <a>WebVTT region identifier setting</a> must be present in each <a>WebVTT region settings
list</a>. Without an identifier, it is not possible to associate a <a>WebVTT cue</a> with a
<a>WebVTT region</a> in the syntax.</p>

<p class="note">The <a>WebVTT region identifier setting</a> gives a name to the region so it can be
referenced by the cues that belong to the region.</p>

<p>A <dfn>WebVTT region width setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>width</code>".</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>A <a>WebVTT percentage</a>.</p></li>
</ol>

<p class="note">The <a>WebVTT region width setting</a> provides a fixed width as a percentage of the
video width for the region into which cues are rendered and based on which alignment is
calculated.</p>

<p>A <dfn>WebVTT region lines setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>lines</code>".</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>One or more <a>ASCII digits</a>.</p></li>
</ol>

<p class="note">The <a>WebVTT region lines setting</a> provides a fixed height as a number of lines
for the region into which cues are rendered. As such, it defines the height of the roll-up region if
it is a scroll region.</p>

<p>A <dfn>WebVTT region anchor setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>regionanchor</code>".</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>A <a>WebVTT percentage</a>.</p></li>
 <li><p>A U+002C COMMA character (,).</p></li>
 <li><p>A <a>WebVTT percentage</a>.</p></li>
</ol>

<p class="note">The <a>WebVTT region anchor setting</a> provides a tuple of two percentages that
specify the point within the region box that is fixed in location. The first percentage measures the
x-dimension and the second percentage measures the y-dimension from the top left corner of the region box. If no
<a>WebVTT region anchor setting</a> is given, the anchor defaults to 0%, 100% (i.e. the bottom left
corner).</p>

<p>A <dfn>WebVTT region viewport anchor setting</dfn> consists of the following components, in the
order given:</p>

<ol>
 <li><p>The string "<code>viewportanchor</code>".</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>A <a>WebVTT percentage</a>.</p></li>
 <li><p>A U+002C COMMA character (,).</p></li>
 <li><p>A <a>WebVTT percentage</a>.</p></li>
</ol>

<p class="note">The <a>WebVTT region viewport anchor setting</a> provides a tuple of two percentages
that specify the point within the video viewport that the region anchor point is anchored to. The
first percentage measures the x-dimension and the second percentage measures the y-dimension from
the top left corner of the video viewport box. If no region viewport anchor is given, it defaults to
0%, 100% (i.e. the bottom left corner).</p>

<p class="note">For browsers, the region maps to an absolutely positioned CSS box relative to the
video viewport, i.e. there is a relatively positioned box that represents the video viewport, relative
to which the regions are absolutely positioned. Overflow is hidden.</p>

<p>A <dfn>WebVTT region scroll setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>scroll</code>".</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>The string "<code>up</code>".</p></li>
</ol>

<p class="note">The <a>WebVTT region scroll setting</a> specifies whether cues rendered into the
region are allowed to move out of their initial rendering place and roll up, i.e. move towards the
top of the video viewport. If the scroll setting is omitted, cues do not move from their rendered
position.</p>

<p class="note">Cues are added to a region one line at a time below existing cue lines. When an
existing rendered cue line is removed, and it was above another already rendered cue line, that cue
line moves into its space, thus scrolling in the given direction. If there is not enough space for a
new cue line to be added to a region, the top-most cue line is pushed off the visible region (thus
slowly becoming invisible as it moves into overflow:hidden). This eventually makes space for the new
cue line and allows it to be added.</p>

<p class="note">When there is no scroll direction, cue lines are added in the empty line closest to
the line at the bottom of the region. If no empty line is available, the oldest line is
replaced.</p>
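
<div class="example">

 <p>The syntax of the region settings defined in this section can be sketched as follows. This is
 a non-normative illustration; the function name <code>parse_region_settings</code> and the
 private helpers are hypothetical. The sketch checks the syntax of each setting, the rule against
 repeated settings, and the rule against two consecutive line terminators.</p>

```python
import re
from typing import Callable, Dict, Optional

_PERCENTAGE = re.compile(r"[0-9]+(?:\.[0-9]+)?%")


def _is_percentage(value: str) -> bool:
    return _PERCENTAGE.fullmatch(value) is not None and float(value[:-1]) <= 100


def _is_anchor(value: str) -> bool:
    parts = value.split(",")
    return len(parts) == 2 and all(_is_percentage(part) for part in parts)


def _is_identifier(value: str) -> bool:
    return re.fullmatch(r"[^ \t\n\f\r]+", value) is not None and "-->" not in value


_VALIDATORS: Dict[str, Callable[[str], bool]] = {
    "id": _is_identifier,
    "width": _is_percentage,
    "lines": lambda value: re.fullmatch(r"[0-9]+", value) is not None,
    "regionanchor": _is_anchor,
    "viewportanchor": _is_anchor,
    "scroll": lambda value: value == "up",
}


def parse_region_settings(text: str) -> Optional[Dict[str, str]]:
    """Return a name-to-value mapping, or None if a syntax rule is broken."""
    # Treat CRLF, LF and CR alike so that a blank line shows up as two LFs.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    if "\n\n" in normalized:
        return None
    settings: Dict[str, str] = {}
    for item in re.split(r"[ \t\n]+", normalized):
        if not item:
            continue
        name, colon, value = item.partition(":")
        validator = _VALIDATORS.get(name)
        # Reject missing colons, unknown names and repeated settings.
        if not colon or validator is None or name in settings:
            return None
        if not validator(value):
            return None
        settings[name] = value
    return settings
```

</div>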


<h3 id=cue-settings>WebVTT cue settings</h3>

<p>A <a>WebVTT cue setting</a> is part of a <a>WebVTT cue settings list</a> and provides
configuration options regarding the position and alignment of the cue box and the cue text
within.</p>

<p class="note">For example, a set of WebVTT cue settings may allow a cue box to be aligned to the
left or positioned at the top right with the cue text within center aligned.</p>

<p>The currently available <a>WebVTT cue settings</a> that may appear in a <a>WebVTT cue settings
list</a> are:</p>

<ul class="brief">
 <li>A <a>WebVTT vertical text cue setting</a>.</li>
 <li>A <a>WebVTT line cue setting</a>.</li>
 <li>A <a>WebVTT position cue setting</a>.</li>
 <li>A <a>WebVTT size cue setting</a>.</li>
 <li>A <a>WebVTT alignment cue setting</a>.</li>
 <li>A <a>WebVTT region cue setting</a>.</li>
</ul>

<p>Each of these settings must not be included more than once per <a>WebVTT cue settings
list</a>.</p>

<p>A <dfn>WebVTT vertical text cue setting</dfn> is a <a>WebVTT cue setting</a> that consists of the
following components, in the order given:</p>

<ol>
 <li>The string "<code>vertical</code>" as the <a>WebVTT cue setting name</a>.</li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li>One of the following strings as the <a>WebVTT cue setting value</a>: "<code>rl</code>",
 "<code>lr</code>".</li>
</ol>

<p class="note">A <a>WebVTT vertical text cue setting</a> configures the cue to use vertical text
layout rather than horizontal text layout. Vertical text layout is sometimes used in Japanese, for
example. The default is horizontal layout.</p>

<p>A <dfn>WebVTT line cue setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>line</code>" as the <a>WebVTT cue setting name</a>.</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li>
  As the <a>WebVTT cue setting value</a>:
  <ol>
   <li>
    an offset value, either:
    <dl>
     <dt>To represent a specific offset relative to the video viewport</dt>
     <dd>
      <p>A <a>WebVTT percentage</a>.</p>
     </dd>
     <dt>Or to represent a line number</dt>
     <dd>
      <ol>
       <li>Optionally a U+002D HYPHEN-MINUS character (-).</li>
       <li>One or more <a>ASCII digits</a>.</li>
      </ol>
     </dd>
    </dl>
   </li>
   <li>
    An optional alignment value consisting of the following components:
    <ol>
     <li>A U+002C COMMA character (,).</li>
     <li>One of the following strings: "<code>start</code>", "<code>center</code>",
     "<code>end</code>"</li>
    </ol>
   </li>
  </ol>
 </li>
</ol>

<p class="note">A <a>WebVTT line cue setting</a> configures the offset of the cue box from the video
viewport's edge in the direction orthogonal to the <a lt="WebVTT cue writing direction">writing
direction</a>. For horizontal cues, this is the vertical offset from the top of the video viewport;
for vertical cues, it is the horizontal offset. The offset is for the <a lt="WebVTT cue line start
alignment">start</a>, <a lt="WebVTT cue line center alignment">center</a>, or <a lt="WebVTT cue line
end alignment">end</a> of the cue box, depending on the <a>WebVTT cue line alignment</a> value - <a
lt="WebVTT cue line start alignment">start</a> by default. The offset can be given either as a
percentage of the relevant writing-mode dependent video viewport dimension or as a line number. Line
numbers are based on the size of the first line of the cue. Positive line numbers count from the
start of the video viewport (the first line is numbered 0), negative line numbers from the end of
the video viewport (the last line is numbered &#x2212;1).</p>
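
<div class="example">

 <p>The value of a line cue setting is either a percentage or a line number, optionally followed
 by an alignment. The following non-normative sketch distinguishes the two forms; the helper name
 <code>parse_line_value</code> and the shape of its result are hypothetical.</p>

```python
import re
from typing import Optional, Tuple, Union

_LINE_VALUE = re.compile(
    r"(?:([0-9]+(?:\.[0-9]+)?)%|(-?[0-9]+))"  # percentage, or signed line number
    r"(?:,(start|center|end))?"               # optional alignment
)


def parse_line_value(value: str) -> Optional[Tuple[Union[int, float], bool, str]]:
    """Return (offset, is_percentage, alignment), or None if the syntax is not met."""
    match = _LINE_VALUE.fullmatch(value)
    if match is None:
        return None
    percentage, line_number, alignment = match.groups()
    # The alignment is "start" when none is given.
    alignment = alignment or "start"
    if percentage is not None:
        offset = float(percentage)
        if offset > 100:
            return None
        return (offset, True, alignment)
    return (int(line_number), False, alignment)
```

 <p>Here <code>"-1,end"</code> is read as the last line with end alignment, and
 <code>"50%"</code> as a percentage offset with the default start alignment. A negative
 percentage such as <code>"-50%"</code> is rejected.</p>

</div>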

<p>A <dfn>WebVTT position cue setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>position</code>" as the <a>WebVTT cue setting name</a>.</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li>
  As the <a>WebVTT cue setting value</a>:
  <ol>
   <li>a position value consisting of: a <a>WebVTT percentage</a>.</li>
   <li>
    an optional alignment value consisting of:
    <ol>
     <li>A U+002C COMMA character (,).</li>
     <li>One of the following strings: "<code>line-left</code>", "<code>center</code>",
     "<code>line-right</code>"</li>
    </ol>
   </li>
  </ol>
 </li>
</ol>

<p class="note">A <a>WebVTT position cue setting</a> configures the indent position of the <a
lt="WebVTT cue box">cue box</a> in the direction orthogonal to the <a>WebVTT line cue setting</a>.
For horizontal cues, this is the horizontal position. The cue position is given as a percentage of
the video viewport. The positioning is for the <a lt="WebVTT cue position line-left
alignment">line-left</a>, <a lt="WebVTT cue position center alignment">center</a>, or <a lt="WebVTT
cue position line-right alignment">line-right</a> of the cue box, depending on the cue's <a lt="cue
computed position alignment">computed position alignment</a>, which is overridden by the <a>WebVTT
position cue setting</a>.</p>

<p>A <dfn>WebVTT size cue setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>size</code>" as the <a>WebVTT cue setting name</a>.</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>As the <a>WebVTT cue setting value</a>: a <a>WebVTT percentage</a>.</p></li>
</ol>

<p class="note">A <a>WebVTT size cue setting</a> configures the size of the <a lt="WebVTT cue
box">cue box</a> in the same direction as the <a>WebVTT position cue setting</a>. For horizontal
cues, this is the width of the <a lt="WebVTT cue box">cue box</a>. It is given as a percentage of
the width of the video viewport.</p>

<p>A <dfn>WebVTT alignment cue setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>align</code>" as the <a>WebVTT cue setting name</a>.</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li>One of the following strings as the <a>WebVTT cue setting value</a>: "<code>start</code>",
 "<code>center</code>", "<code>end</code>", "<code>left</code>", "<code>right</code>"</li>
</ol>

<p class="note">A <a>WebVTT alignment cue setting</a> configures the alignment of the text within
the cue. The "<code>start</code>" and "<code>end</code>" keywords are relative to the cue text's
lines' base direction; for left-to-right English text, "<code>start</code>" means left-aligned.</p>

<p>A <dfn>WebVTT region cue setting</dfn> consists of the following components, in the order
given:</p>

<ol>
 <li><p>The string "<code>region</code>" as the <a>WebVTT cue setting name</a>.</p></li>
 <li><p>A U+003A COLON character (:).</p></li>
 <li><p>As the <a>WebVTT cue setting value</a>: a <a>WebVTT region identifier</a>.</p></li>
</ol>

<p>A <a>WebVTT region cue setting</a> configures a cue to become part of a region by referencing the
region's identifier unless the cue has a <a lt="WebVTT vertical text cue setting">"vertical"</a>, <a
lt="WebVTT line cue setting">"line"</a> or <a lt="WebVTT size cue setting">"size"</a> cue setting.
If a cue is part of a region, its cue settings for <a lt="WebVTT position cue
setting">"position"</a> and <a lt="WebVTT alignment cue setting">"align"</a> are applied to the line
boxes in the cue relative to the region box and the cue box width and height are calculated relative
to the region dimensions rather than the viewport dimensions.</p>
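
<div class="example">

 <p>The condition under which a region cue setting takes effect can be sketched as below. This is
 a non-normative illustration that assumes the cue settings have already been collected into a
 mapping from setting name to value; the function name <code>effective_region</code> is
 hypothetical.</p>

```python
from typing import Dict, Optional

# Settings whose presence prevents the cue from becoming part of a region.
_EXCLUDING_SETTINGS = ("vertical", "line", "size")


def effective_region(settings: Dict[str, str]) -> Optional[str]:
    """Return the identifier of the region the cue joins, or None."""
    if any(name in settings for name in _EXCLUDING_SETTINGS):
        return None
    return settings.get("region")
```

</div>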


<h3 id=properties-of-cue-sequences>Properties of cue sequences</h3>


<h4 id=file-using-only-nested-cues>WebVTT file using only nested cues</h4>

<p>A <a>WebVTT file</a> whose cues all follow the following rules is said to be a <dfn>WebVTT file
using only nested cues</dfn>:</p>

<p>given any two cues |cue1| and |cue2| with start and end time offsets (|x1|, |y1|) and (|x2|,
|y2|) respectively,</p>

<ul>
 <li>either |cue1| lies fully within |cue2|, i.e. |x1| >= |x2| and |y1| &lt;= |y2|</li>
 <li>or |cue1| fully contains |cue2|, i.e. |x1| &lt;= |x2| and |y1| >= |y2|.</li>
</ul>

<div class="example">

 <p>The following example matches this definition:</p>

 <pre>
 WEBVTT

 00:00.000 --> 01:24.000
 Introduction

 00:00.000 --> 00:44.000
 Topics

 00:44.000 --> 01:19.000
 Presenters

 01:24.000 --> 05:00.000
 Scrolling Effects

 01:35.000 --> 03:00.000
 Achim's Demo

 03:00.000 --> 05:00.000
 Timeline Panel
 </pre>

 <p>Notice how you can express the cues in this WebVTT file as a tree structure:</p>

 <ul>
  <li>
   WebVTT file
   <ul>
    <li>
     Introduction
     <ul>
      <li>Topics</li>
      <li>Presenters</li>
     </ul>
    </li>
    <li>
     Scrolling Effects
     <ul>
      <li>Achim's Demo</li>
      <li>Timeline Panel</li>
     </ul>
    </li>
   </ul>
  </li>
 </ul>

 <p>If the file has cues that can't be expressed in this fashion, then they don't match the
 definition of a <a>WebVTT file using only nested cues</a>. For example:</p>

 <pre>
 WEBVTT

 00:00.000 --> 01:00.000
 The First Minute

 00:30.000 --> 01:30.000
 The Final Minute
 </pre>

 <p>In this ninety-second example, the two cues partly overlap, with the first ending before the
 second ends and the second starting before the first ends. This therefore is not a <a>WebVTT file
 using only nested cues</a>.</p>

</div>
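
<div class="example">

 <p>A pairwise check in the spirit of this definition might look like the non-normative sketch
 below. The function name <code>uses_only_nested_cues</code> is hypothetical. The first example
 file above contains pairs of cues that do not overlap at all (such as "Introduction" and
 "Scrolling Effects"), so this sketch assumes that such pairs are acceptable in addition to the
 two containment cases; that reading is an assumption of the sketch.</p>

```python
from itertools import combinations
from typing import Iterable, Tuple

Cue = Tuple[float, float]  # (start time offset, end time offset)


def uses_only_nested_cues(cues: Iterable[Cue]) -> bool:
    """Return True if no two cues partly overlap."""
    for (x1, y1), (x2, y2) in combinations(list(cues), 2):
        within = x1 >= x2 and y1 <= y2
        contains = x1 <= x2 and y1 >= y2
        # Assumption: cues that do not overlap at all are also accepted.
        disjoint = y1 <= x2 or y2 <= x1
        if not (within or contains or disjoint):
            return False
    return True
```

 <p>With times in seconds, the six cues of the first example file pass this check, and the two
 cues of the second example file, <code>(0, 60)</code> and <code>(30, 90)</code>, fail it.</p>

</div>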


<h3 id=types-of-webvtt-files>Types of WebVTT files</h3>

<p>The syntax definition of WebVTT files allows authoring of a wide variety of WebVTT files with a
mix of cues. However, only a small subset of WebVTT file types are typically authored.</p>

<p>Conformance checkers, when validating <a>WebVTT files</a>, may offer to restrict syntax checking
to one of these types.</p>

<h4 id=file-using-metadata-content>WebVTT file using metadata content</h4>

<p>A <a>WebVTT file</a> whose cues all have a <a>cue payload</a> that is <a>WebVTT metadata text</a>
is said to be a <dfn export>WebVTT file using metadata content</dfn>.</p>

<h4 id=file-using-chapter-title-text>WebVTT file using chapter title text</h4>

<p>A <dfn export>WebVTT file using chapter title text</dfn> is a <a>WebVTT file using only nested cues</a>
whose cues all have a <a>cue payload</a> that is <a>WebVTT chapter title text</a>.</p>


<h4 id=file-using-cue-text>WebVTT file using caption or subtitle cue text</h4>

<p>A <a>WebVTT file</a> whose cues all have a <a>cue payload</a> that is <a>WebVTT caption or
subtitle cue text</a> is said to be a <dfn export>WebVTT file using caption or subtitle cue text</dfn>.</p>


<h2 id=default-classes>Default classes for WebVTT Caption or Subtitle Cue Components</h2>

<p>Many captioning formats have simple ways of specifying a limited subset of text colors and
background colors for text. Therefore, the WebVTT spec makes available a set of default <a>cue
component class names</a> for <a>WebVTT caption or subtitle cue components</a> that authors can use
in a standard way to mark up colored text and text background.</p>

<p class=note>User agents that support CSS style sheets may implement this section through adding
User Agent stylesheets.</p>

<h3 id=default-text-color>Default text colors</h3>

<p><a>WebVTT caption or subtitle cue components</a> that have one or more <a lt="cue component class
names">class names</a> matching those in the first cell of a row in the table below must set their
'color' property as <a spec=html>presentational hints</a> to the value in the second cell of the
row:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a lt="cue component class names">class names</a></th>
   <th>'color' value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><code>white</code></td>
   <td>''rgba(255,255,255,1)''</td>
  </tr>
  <tr>
   <td><code>lime</code></td>
   <td>''rgba(0,255,0,1)''</td>
  </tr>
  <tr>
   <td><code>cyan</code></td>
   <td>''rgba(0,255,255,1)''</td>
  </tr>
  <tr>
   <td><code>red</code></td>
   <td>''rgba(255,0,0,1)''</td>
  </tr>
  <tr>
   <td><code>yellow</code></td>
   <td>''rgba(255,255,0,1)''</td>
  </tr>
  <tr>
   <td><code>magenta</code></td>
   <td>''rgba(255,0,255,1)''</td>
  </tr>
  <tr>
   <td><code>blue</code></td>
   <td>''rgba(0,0,255,1)''</td>
  </tr>
  <tr>
   <td><code>black</code></td>
   <td>''rgba(0,0,0,1)''</td>
  </tr>
 </tbody>
</table>

<p class=note>If your background is captioning, don't get confused: The color for the class
<code>lime</code> is what has traditionally been used in captioning under the name ''green'' (e.g.
608/708).</p>

<p class=note>Do not use the classes <code>blue</code> and <code>black</code> on the default dark
background, since they result in unreadable text. In general, please refer to WCAG for guidance on
color contrast [[WCAG20]] and make sure to take into account the text color, background color and
also the video's color.</p>

<h3 id=default-text-background>Default text background colors</h3>

<p><a>WebVTT caption or subtitle cue components</a> that have one or more <a lt="cue component class
names">class names</a> matching those in the first cell of a row in the table below must set their
'background-color' property as <a spec=html>presentational hints</a> to the value in the second cell
of the row:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a lt="cue component class names">class names</a></th>
   <th>'background-color' value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><code>bg_white</code></td>
   <td>''rgba(255,255,255,1)''</td>
  </tr>
  <tr>
   <td><code>bg_lime</code></td>
   <td>''rgba(0,255,0,1)''</td>
  </tr>
  <tr>
   <td><code>bg_cyan</code></td>
   <td>''rgba(0,255,255,1)''</td>
  </tr>
  <tr>
   <td><code>bg_red</code></td>
   <td>''rgba(255,0,0,1)''</td>
  </tr>
  <tr>
   <td><code>bg_yellow</code></td>
   <td>''rgba(255,255,0,1)''</td>
  </tr>
  <tr>
   <td><code>bg_magenta</code></td>
   <td>''rgba(255,0,255,1)''</td>
  </tr>
  <tr>
   <td><code>bg_blue</code></td>
   <td>''rgba(0,0,255,1)''</td>
  </tr>
  <tr>
   <td><code>bg_black</code></td>
   <td>''rgba(0,0,0,1)''</td>
  </tr>
 </tbody>
</table>

<p class=note>The color for the class <code>bg_lime</code> is what has traditionally been used in
captioning under the name ''green'' (e.g. 608/708).</p>

<p>For the purpose of determining the <a spec=css-cascade>cascade</a> of the color and background
classes, the order of appearance determines the cascade of the classes.</p>

<div class="example">

 <p>This example shows how to use the classes.</p>

 <pre>
 WEBVTT

 02:00.000 --> 02:05.000
 &lt;c.yellow.bg_blue>This is yellow text on a blue background&lt;/c>

 04:00.000 --> 04:05.000
 &lt;c.yellow.bg_blue.magenta.bg_black>This is magenta text on a black background&lt;/c>
 </pre>

</div>
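
<div class="example">

 <p>As a rough illustration of the order-of-appearance rule, the following Python sketch resolves
 the default text and background colors for a list of class names. The names
 <code>DEFAULT_COLORS</code> and <code>resolve_default_colors</code> are hypothetical, colors are
 given as RGB triples (alpha is 1 for every entry in the tables above), and author style sheets,
 which can override these defaults, are not modelled.</p>

```python
# RGB values copied from the default color tables; alpha is always 1.
DEFAULT_COLORS = {
    "white": (255, 255, 255),
    "lime": (0, 255, 0),
    "cyan": (0, 255, 255),
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "magenta": (255, 0, 255),
    "blue": (0, 0, 255),
    "black": (0, 0, 0),
}


def resolve_default_colors(class_names):
    """Return (color, background); a later class overrides an earlier one."""
    color = background = None
    for name in class_names:
        if name in DEFAULT_COLORS:
            color = DEFAULT_COLORS[name]
        elif name.startswith("bg_") and name[3:] in DEFAULT_COLORS:
            background = DEFAULT_COLORS[name[3:]]
    return color, background


first_cue = resolve_default_colors(["yellow", "bg_blue"])
second_cue = resolve_default_colors(["yellow", "bg_blue", "magenta", "bg_black"])
```

 <p>For the two cues in the example above, this yields yellow on blue and magenta on black
 respectively. Class names that are not in the tables leave both values untouched.</p>

</div>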

<p class=note>Default classes can be changed by authors, e.g.
<code>::cue(.yellow) {color:cyan}</code> would change all <code>.yellow</code> classed text to
cyan.</p>


<h2 id=parsing>Parsing</h2>

<p>WebVTT file parsing is the same for all types of WebVTT files, including captions, subtitles,
chapters, or metadata. Most of the steps will be skipped for chapters or metadata files.</p>


<h3 id=file-parsing algorithm>WebVTT file parsing</h3>

<p>A <dfn>WebVTT parser</dfn>, given an input byte stream, a <a>text track list of cues</a>
|output|, and a collection of <a spec=cssom>CSS style sheets</a> |stylesheets|, must decode the byte
stream using the <a lt="UTF-8 decode">UTF-8 decode</a> algorithm, and then must parse the resulting
string according to the <a>WebVTT parser algorithm</a> below. This results in <a>WebVTT cues</a>
being added to |output|, and <a spec=cssom>CSS style sheets</a> being added to |stylesheets|.
[[!RFC3629]]</p>

<p>A <a>WebVTT parser</a>, specifically its conversion and parsing steps, is typically run
asynchronously, with the input byte stream being updated incrementally as the resource is
downloaded; this is called an <dfn>incremental WebVTT parser</dfn>.</p>

<p>A <a>WebVTT parser</a> verifies a file signature before parsing the provided byte stream. If the
stream lacks this WebVTT file signature, then the parser aborts.</p>

<p>The <dfn>WebVTT parser algorithm</dfn> is as follows:</p>

<ol algorithm="WebVTT parser algorithm">

 <li>
  <p>Let |input| be the string being parsed, after conversion to Unicode, and with the following
  transformations applied:</p>

  <ul>

   <li><p>Replace all U+0000 NULL characters by U+FFFD REPLACEMENT CHARACTERs.</p></li>

   <li><p>Replace each U+000D CARRIAGE RETURN U+000A LINE FEED (CRLF) character pair by a single
   U+000A LINE FEED (LF) character.</p></li>

   <li><p>Replace all remaining U+000D CARRIAGE RETURN characters by U+000A LINE FEED (LF)
   characters.</p></li>

  </ul>

 </li>

 <li><p>Let |position| be a pointer into |input|, initially pointing at the start of the string. In
 an <a>incremental WebVTT parser</a>, when this algorithm (or further algorithms that it uses) moves
 the |position| pointer, the user agent must wait until appropriate further characters from the byte
 stream have been added to |input| before moving the pointer, so that the algorithm never reads past
 the end of the |input| string. Once the byte stream has ended, and all characters have been added
 to |input|, then the |position| pointer may, when so instructed by the algorithms, be moved past
 the end of |input|.</p></li>

 <li>Let |seen cue| be false.</li>

 <!-- SIGNATURE CHECK -->

 <li><p>If |input| is less than six characters long, then abort these steps. The file does not start
 with the correct <a>WebVTT file</a> signature and was therefore not successfully
 processed.</p></li>

 <li><p>If |input| is exactly six characters long but does not exactly equal "<code>WEBVTT</code>",
 then abort these steps. The file does not start with the correct <a>WebVTT file</a> signature and
 was therefore not successfully processed.</p></li>

 <li><p>If |input| is more than six characters long but the first six characters do not exactly
 equal "<code>WEBVTT</code>", or the seventh character is not a U+0020 SPACE character, a U+0009
 CHARACTER TABULATION (tab) character, or a U+000A LINE FEED (LF) character, then abort these steps.
 The file does not start with the correct <a>WebVTT file</a> signature and was therefore not
 successfully processed.</p></li>

 <li><p><a>collect a sequence of code points</a> that are <em>not</em> U+000A LINE FEED (LF)
 characters.</p></li>

 <li><p>If |position| is past the end of |input|, then abort these steps. The file was successfully
 processed, but it contains no useful data and so no <a lt="WebVTT cue">WebVTT cues</a> were added
 to |output|.</p></li>

 <li><p>The character indicated by |position| is a U+000A LINE FEED (LF) character. Advance
 |position| to the next character in |input|.</p></li>

 <li><p>If |position| is past the end of |input|, then abort these steps. The file was successfully
 processed, but it contains no useful data and so no <a lt="WebVTT cue">WebVTT cues</a> were added
 to |output|.</p></li>

 <li><p><i>Header</i>: If the character indicated by |position| is not a U+000A LINE FEED (LF)
 character, then <a>collect a WebVTT block</a> with the <i>in header</i> flag set. Otherwise,
 advance |position| to the next character in |input|.</p></li>

 <li><p><a>collect a sequence of code points</a> that are U+000A LINE FEED (LF) characters.</p></li>

 <li><p>Let |regions| be an empty <a>text track list of regions</a>.</p></li>

 <li>

  <p><i>Block loop</i>: While |position| doesn't point past the end of |input|:</p>

  <ol>

   <li><p><a>Collect a WebVTT block</a>, and let |block| be the returned value.</p></li>

   <li><p>If |block| is a <a>WebVTT cue</a>, add |block| to the <a>text track list of cues</a>
   |output|.</p></li>

   <li><p>Otherwise, if |block| is a <a spec=cssom>CSS style sheet</a>, add |block| to
   |stylesheets|.</p></li>

   <li><p>Otherwise, if |block| is a <a>WebVTT region object</a>, add |block| to |regions|.</p></li>

   <!-- handle new block types here -->

   <li><p><a>collect a sequence of code points</a> that are U+000A LINE FEED (LF)
   characters.</p></li>

  </ol>

 </li>

 <li><p><i>End</i>: The file has ended. Abort these steps. The <a>WebVTT parser</a> has finished.
 The file was successfully processed.</p></li>

</ol>
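
<div class="example">

 <p>The input preparation and the signature check at the start of the algorithm can be sketched in
 Python as follows. The function names are hypothetical, and Python's <code>utf-8-sig</code> codec
 with replacement of malformed bytes is used here only as an approximation of the UTF-8 decode
 algorithm.</p>

```python
def decode_input(data):
    """Approximate UTF-8 decode: drop a leading BOM, replace malformed bytes."""
    return data.decode("utf-8-sig", errors="replace")


def normalize_input(text):
    """Apply the three character replacements from the first step."""
    text = text.replace("\x00", "\ufffd")  # NULL becomes REPLACEMENT CHARACTER
    text = text.replace("\r\n", "\n")      # CRLF pairs collapse to a single LF
    return text.replace("\r", "\n")        # any remaining CR becomes LF


def has_webvtt_signature(text):
    """Check the three signature conditions on the normalized string."""
    if len(text) < 6 or text[:6] != "WEBVTT":
        return False
    # Exactly "WEBVTT", or "WEBVTT" followed by a space, tab or line feed.
    return len(text) == 6 or text[6] in " \t\n"


sample = normalize_input(decode_input(b"\xef\xbb\xbfWEBVTT\r\n\r\n"))
```

 <p>A parser built along these lines would abort without producing cues whenever
 <code>has_webvtt_signature</code> returns false.</p>

</div>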

<p>When the algorithm above says to <dfn>collect a WebVTT block</dfn>, optionally with a flag <i>in
header</i> set, the user agent must run the following steps:</p>

<ol algorithm="collect a WebVTT block">

 <li><p>Let |input|, |position|, |seen cue| and |regions| be the same variables as those of the same
 name in the algorithm that invoked these steps.</p></li>

 <li><p>Let |line count| be zero.</p></li>

 <li><p>Let |previous position| be |position|.</p></li>

 <li><p>Let |line| be the empty string.</p></li>

 <li><p>Let |buffer| be the empty string.</p></li>

 <li><p>Let |seen EOF| be false.</p></li>

 <li><p>Let |seen arrow| be false.</p></li>

 <li><p>Let |cue| be null.</p></li>

 <li><p>Let |stylesheet| be null.</p></li>

 <li><p>Let |region| be null.</p></li>

 <li>

  <p><i>Loop</i>: Run these substeps in a loop:</p>

  <ol>

   <li><p><a>collect a sequence of code points</a> that are <em>not</em> U+000A LINE FEED (LF)
   characters. Let |line| be those characters, if any.</p></li>

   <li><p>Increment |line count| by 1.</p></li>

   <li><p>If |position| is past the end of |input|, let |seen EOF| be true. Otherwise, the character
   indicated by |position| is a U+000A LINE FEED (LF) character; advance |position| to the next
   character in |input|.</p></li>

   <li>

    <p>If |line| contains the three-character substring "<code>--></code>" (U+002D HYPHEN-MINUS,
    U+002D HYPHEN-MINUS, U+003E GREATER-THAN SIGN), then run these substeps:</p>

    <ol>

     <li>

      <p>If <i>in header</i> is not set and at least one of the following conditions is true:</p>

      <ul>

       <li><p>|line count| is 1</p></li>

       <li><p>|line count| is 2 and |seen arrow| is false</p></li>

      </ul>

      <p>...then run these substeps:</p>

      <ol>

       <li><p>Let |seen arrow| be true.</p></li>

       <li><p>Let |previous position| be |position|.</p></li>

       <li>

        <p><i>Cue creation</i>: Let |cue| be a new <a>WebVTT cue</a> and initialize it as
        follows:</p>

        <ol>

         <li><p>Let |cue|'s <a>text track cue identifier</a> be |buffer|.</p></li>

         <li><p>Let |cue|'s <a>text track cue pause-on-exit flag</a> be false.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue region</a> be null.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue writing direction</a> be <a lt="WebVTT cue horizontal
         writing direction">horizontal</a>.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue snap-to-lines flag</a> be true.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue line</a> be <a lt="WebVTT cue line
         automatic">auto</a>.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue line alignment</a> be <a lt="WebVTT cue line start
         alignment">start alignment</a>.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue position</a> be <a lt="WebVTT cue automatic
         position">auto</a>.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue position alignment</a> be <a lt="WebVTT cue position
         automatic alignment">auto</a>.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue size</a> be 100.</p></li>

         <li><p>Let |cue|'s <a>WebVTT cue text alignment</a> be <a lt="WebVTT cue center
         alignment">center alignment</a>.</p></li>

         <li><p>Let |cue|'s <a>cue text</a> be the empty string.</p></li>

        </ol>

       </li>

       <li><p><a>Collect WebVTT cue timings and settings</a> from |line| using |regions| for |cue|.
       If that fails, let |cue| be null. Otherwise, let |buffer| be the empty string and let |seen
       cue| be true.</p></li>

      </ol>

      <p>Otherwise, let |position| be |previous position| and break out of <i>loop</i>.</p>

     </li>

    </ol>

   </li>

   <li><p>Otherwise, if |line| is the empty string, break out of <i>loop</i>.</p></li>

   <li>

    <p>Otherwise, run these substeps:</p>

    <ol>

     <li>

      <p>If <i>in header</i> is not set and |line count| is 2, run these substeps:</p>

      <ol>

       <li>

        <p>If |seen cue| is false and |buffer| starts with the substring "<code>STYLE</code>"
        (U+0053 LATIN CAPITAL LETTER S, U+0054 LATIN CAPITAL LETTER T, U+0059 LATIN CAPITAL LETTER
        Y, U+004C LATIN CAPITAL LETTER L, U+0045 LATIN CAPITAL LETTER E), and the remaining
        characters in |buffer| (if any) are all <a>ASCII whitespace</a>, then run these
        substeps:</p>

        <ol>
         <li>
          <p>Let |stylesheet| be the result of <a spec=cssom lt="create a CSS style sheet">creating
          a CSS style sheet</a>, with the following properties: [[!CSSOM]]</p>
          <dl>
           <dt><a spec=cssom for=CSSStyleSheet>location</a></dt>
           <dd>null</dd>
           <dt><a spec=cssom for=CSSStyleSheet>parent CSS style sheet</a></dt>
           <dd>null</dd>
           <dt><a spec=cssom for=CSSStyleSheet>owner node</a></dt>
           <dd>null</dd>
           <dt><a spec=cssom for=CSSStyleSheet>owner CSS rule</a></dt>
           <dd>null</dd>
           <dt><a spec=cssom for=CSSStyleSheet>media</a></dt>
           <dd>The empty string.</dd>
           <dt><a spec=cssom for=CSSStyleSheet>title</a></dt>
           <dd>The empty string.</dd>
           <dt><a spec=cssom for=CSSStyleSheet>alternate flag</a></dt>
           <dd>Unset.</dd>
           <dt><a spec=cssom for=CSSStyleSheet>origin-clean flag</a></dt>
           <dd>Set.</dd>
          </dl>
         </li>

         <li><p>Let |buffer| be the empty string.</p></li>
        </ol>

       </li>

       <li>

        <p>Otherwise, if |seen cue| is false and |buffer| starts with the substring
        "<code>REGION</code>" (U+0052 LATIN CAPITAL LETTER R, U+0045 LATIN CAPITAL LETTER E, U+0047
        LATIN CAPITAL LETTER G, U+0049 LATIN CAPITAL LETTER I, U+004F LATIN CAPITAL LETTER O, U+004E
        LATIN CAPITAL LETTER N), and the remaining characters in |buffer| (if any) are all <a>ASCII
        whitespace</a>, then run these substeps:</p>

        <ol>

         <li><p><i>Region creation</i>: Let |region| be a new <a>WebVTT region</a>.</p></li>

         <li><p>Let |region|'s <a lt="WebVTT region identifier">identifier</a> be the empty
         string.</p></li>

         <li><p>Let |region|'s <a lt="WebVTT region width">width</a> be 100.</p></li>

         <li><p>Let |region|'s <a lt="WebVTT region lines">lines</a> be 3.</p></li>

         <li><p>Let |region|'s <a lt="WebVTT region anchor">anchor point</a> be (0,100).</p></li>

         <li><p>Let |region|'s <a lt="WebVTT region viewport anchor">viewport anchor point</a> be
         (0,100).</p></li>

         <li><p>Let |region|'s <a lt="WebVTT region scroll">scroll value</a> be <a lt="WebVTT region
         scroll none">none</a>.</p></li>

         <li><p>Let |buffer| be the empty string.</p></li>

        </ol>

       </li>

       <!-- <li><p>Otherwise, (check for new block types here)</p></li> -->

      </ol>

     </li>

     <li><p>If |buffer| is not the empty string, append a U+000A LINE FEED (LF) character to
     |buffer|.</p></li>

     <li><p>Append |line| to |buffer|.</p></li>

     <li><p>Let |previous position| be |position|.</p></li>

    </ol>

   </li>

   <li><p>If |seen EOF| is true, break out of <i>loop</i>.</p></li>

  </ol>

 </li>

 <li><p>If |cue| is not null, let the <a>cue text</a> of |cue| be |buffer|, and return
 |cue|.</p></li>

 <li><p>Otherwise, if |stylesheet| is not null, then <a spec=css-syntax>Parse a stylesheet</a> from
 |buffer|. If it returned a list of rules, assign the list as |stylesheet|'s <a spec=cssom
 for=CSSStyleSheet>CSS rules</a>; otherwise, set |stylesheet|'s <a spec=cssom for=CSSStyleSheet>CSS
 rules</a> to an empty list. [[!CSSOM]] [[!CSS-SYNTAX-3]] Finally, return |stylesheet|.</p></li>

 <li><p>Otherwise, if |region| is not null, then <a>collect WebVTT region settings</a> from |buffer|
 using |region| for the results. Construct a <a>WebVTT Region Object</a> from |region|, and return
 it.</p></li>

 <!-- return new block types here -->

 <li><p>Otherwise, return null.</p></li>

</ol>
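
<div class="example">

 <p>The control flow of the block-collection steps can be sketched in Python as shown below. This
 is a simplified illustration, not a conforming implementation: the function name
 <code>collect_block</code>, its return tuple, and the <code>timings_ok</code> callback (a stand-in
 for collecting cue timings and settings) are assumptions, and the cue identifier and the
 construction of cue, style sheet and region objects are left out for brevity.</p>

```python
def collect_block(text, pos, seen_cue, in_header=False,
                  timings_ok=lambda line: True):
    """Return (kind, payload, pos, seen_cue); kind is "cue", "style", "region" or None."""
    line_count = 0
    previous_pos = pos
    buffer = ""
    seen_arrow = False
    kind = None
    while True:
        end = text.find("\n", pos)
        seen_eof = end == -1
        line = text[pos:] if seen_eof else text[pos:end]
        pos = len(text) if seen_eof else end + 1
        line_count += 1
        if "-->" in line:
            starts_cue = not in_header and (
                line_count == 1 or (line_count == 2 and not seen_arrow))
            if not starts_cue:
                pos = previous_pos  # leave this line for the next block
                break
            seen_arrow = True
            previous_pos = pos
            if timings_ok(line):
                kind, buffer, seen_cue = "cue", "", True
            else:
                kind = None
        elif line == "":
            break
        else:
            if not in_header and line_count == 2 and not seen_cue:
                for keyword in ("STYLE", "REGION"):
                    rest = buffer[len(keyword):]
                    if buffer.startswith(keyword) and rest.strip(" \t\n\f\r") == "":
                        kind, buffer = keyword.lower(), ""
                        break
            if buffer:
                buffer += "\n"
            buffer += line
            previous_pos = pos
        if seen_eof:
            break
    return kind, buffer, pos, seen_cue


styled = "STYLE\n::cue { color: lime }\n\nid1\n00:01.000 --> 00:02.000\nHello\nworld\n"
style_block = collect_block(styled, 0, False)
cue_block = collect_block(styled, style_block[2], style_block[3])

adjacent = "00:01.000 --> 00:02.000\nA\n00:03.000 --> 00:04.000\nB"
first_block = collect_block(adjacent, 0, False)
```

 <p>In the second input, the arrow on the third line ends the first block and the position is
 rewound, so the next call starts at the second cue's timing line.</p>

</div>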


<h3 id=region-settings-parsing algorithm>WebVTT region settings parsing</h3>

<p>When the <a>WebVTT parser algorithm</a> says to <dfn>collect WebVTT region settings</dfn> from a
string |input| for a <a>text track</a>, the user agent must run the following algorithm.</p>

<p>A <dfn>WebVTT region object</dfn> is a conceptual construct to represent a <a>WebVTT region</a>
that is used as a root node for <a lt="List of WebVTT node objects">lists of WebVTT node
objects</a>. This algorithm returns a list of <a lt="WebVTT region object">WebVTT Region
Objects</a>.</p>

<ol algorithm="WebVTT region objects">
 <li><p>Let |settings| be the result of <a lt="split a string on spaces">splitting |input| on
 spaces</a>.</p></li>

 <li>
  <p>For each token |setting| in the list |settings|, run the following substeps:</p>

  <ol>
   <li><p>If |setting| does not contain a U+003A COLON character (:), or if the first U+003A COLON
   character (:) in |setting| is either the first or last character of |setting|, then jump to the
   step labeled <i>next setting</i>.</p></li>

   <li><p>Let |name| be the leading substring of |setting| up to and excluding the first U+003A
   COLON character (:) in that string.</p></li>

   <li><p>Let |value| be the trailing substring of |setting| starting from the character immediately
   after the first U+003A COLON character (:) in that string.</p></li>

   <li>
    <p>Run the appropriate substeps that apply for the value of |name|, as follows:</p>

    <dl>
     <dt><p>If |name| is a <a>case-sensitive</a> match for "<code>id</code>"</p></dt>
     <dd><p>Let |region|'s <a lt="WebVTT region identifier">identifier</a> be |value|.</p></dd>

     <dt><p>Otherwise if |name| is a <a>case-sensitive</a> match for "<code>width</code>"</p></dt>
     <dd><p>If <a>parse a percentage string</a> from |value| returns a |percentage|, let |region|'s
     <a>WebVTT region width</a> be |percentage|.</p></dd>

     <dt>Otherwise if |name| is a <a>case-sensitive</a> match for "<code>lines</code>"</dt>
     <dd>
      <ol>
       <li><p>If |value| contains any characters other than <a>ASCII digits</a>, then jump to the
       step labeled <i>next setting</i>.</p></li>

       <li><p>Interpret |value| as an integer, and let |number| be that number.</p></li>

       <li><p>Let |region|'s <a>WebVTT region lines</a> be |number|.</p></li>
      </ol>
     </dd>

     <dt>Otherwise if |name| is a <a>case-sensitive</a> match for "<code>regionanchor</code>"</dt>
     <dd>
      <ol>
       <li><p>If |value| does not contain a U+002C COMMA character (,), then jump to the step
       labeled <i>next setting</i>.</p></li>

       <li><p>Let |anchorX| be the leading substring of |value| up to and excluding the first U+002C
       COMMA character (,) in that string.</p></li>

       <li><p>Let |anchorY| be the trailing substring of |value| starting from the character
       immediately after the first U+002C COMMA character (,) in that string.</p></li>

       <li><p>If either <a>parse a percentage string</a> from |anchorX| or <a>parse a percentage
       string</a> from |anchorY| doesn't return a |percentage|, then jump to the step labeled
       <i>next setting</i>.</p></li>

       <li><p>Let |region|'s <a lt="WebVTT region anchor">WebVTT region anchor point</a> be the
       tuple of the |percentage| values calculated from |anchorX| and |anchorY|.</p></li>
      </ol>
     </dd>

     <dt>Otherwise if |name| is a <a>case-sensitive</a> match for "<code>viewportanchor</code>"</dt>
     <dd>
      <ol>
       <li><p>If |value| does not contain a U+002C COMMA character (,), then jump to the step
       labeled <i>next setting</i>.</p></li>

       <li><p>Let |viewportanchorX| be the leading substring of |value| up to and excluding the
       first U+002C COMMA character (,) in that string.</p></li>

       <li><p>Let |viewportanchorY| be the trailing substring of |value| starting from the character
       immediately after the first U+002C COMMA character (,) in that string.</p></li>

       <li><p>If either <a>parse a percentage string</a> from |viewportanchorX| or <a>parse a
       percentage string</a> from |viewportanchorY| doesn't return a |percentage|, then jump to the
       step labeled <i>next setting</i>.</p></li>

       <li><p>Let |region|'s <a lt="WebVTT region viewport anchor">WebVTT region viewport anchor
       point</a> be the tuple of the |percentage| values calculated from |viewportanchorX| and
       |viewportanchorY|.</p></li>
      </ol>
     </dd>

     <dt>Otherwise if |name| is a <a>case-sensitive</a> match for "<code>scroll</code>"</dt>
     <dd>
      <ol>
       <li><p>If |value| is a <a>case-sensitive</a> match for the string "<code>up</code>", then let
       |region|'s <a lt="WebVTT region scroll">scroll value</a> be <a lt="WebVTT region scroll
       up">up</a>.</p></li>
      </ol>
     </dd>
    </dl>
   </li>

   <li><i>Next setting</i>: Continue to the next setting, if any.</li>
  </ol>
 </li>
</ol>
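
<div class="example">

 <p>The region settings steps above might be sketched in Python as follows. The dictionary keys,
 the function names, and the compact <code>percentage</code> helper are illustrative assumptions;
 the helper assumes that a WebVTT percentage is one or more ASCII digits, an optional fractional
 part, and a percent sign.</p>

```python
import re

ASCII_WHITESPACE = re.compile(r"[ \t\n\f\r]+")
PERCENTAGE = re.compile(r"[0-9]+(?:\.[0-9]+)?%")


def percentage(value):
    """Compact stand-in for parsing a percentage string; None means failure."""
    if PERCENTAGE.fullmatch(value) is None:
        return None
    number = float(value[:-1])
    return number if number <= 100 else None


def anchor(value):
    """Parse "x%,y%" into a tuple, or return None if either half is invalid."""
    x, comma, y = value.partition(",")
    if not comma:
        return None
    point = (percentage(x), percentage(y))
    return None if None in point else point


def collect_region_settings(text):
    """Start from the region defaults and apply every valid setting in text."""
    region = {"id": "", "width": 100, "lines": 3, "regionanchor": (0, 100),
              "viewportanchor": (0, 100), "scroll": "none"}
    for setting in filter(None, ASCII_WHITESPACE.split(text)):
        name, colon, value = setting.partition(":")
        if not (colon and name and value):
            continue  # no colon, or the first colon is the first or last character
        if name == "id":
            region["id"] = value
        elif name == "width":
            parsed = percentage(value)
            if parsed is not None:
                region["width"] = parsed
        elif name == "lines":
            if re.fullmatch(r"[0-9]+", value):
                region["lines"] = int(value)
        elif name in ("regionanchor", "viewportanchor"):
            parsed = anchor(value)
            if parsed is not None:
                region[name] = parsed
        elif name == "scroll" and value == "up":
            region["scroll"] = "up"
    return region


good = collect_region_settings(
    "id:fred width:40% lines:3 regionanchor:0%,100% viewportanchor:10%,90% scroll:up")
bad = collect_region_settings("width:140% lines:3x scroll:down :x y: regionanchor:50%")
```

 <p>Invalid settings are skipped rather than treated as fatal, so the second call returns the
 defaults unchanged.</p>

</div>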

<p>The rules to <dfn>parse a percentage string</dfn> are as follows. This will return either a
number in the range 0..100, or nothing. If at any point the algorithm says that it "fails", this
means that it is aborted at that point and returns nothing.</p>

<ol algorithm="parse a percentage string">
 <li><p>Let |input| be the string being parsed.</p></li>

 <li><p>If |input| does not match the syntax for a <a>WebVTT percentage</a>, then fail.</p></li>

 <li><p>Remove the last character from |input|.</p></li>

 <li><p>Let |percentage| be the result of parsing |input| using the <a>rules for parsing
 floating-point number values</a>. [[!HTML]]</p></li>

 <li><p>If |percentage| is an error, is less than 0, or is greater than 100, then fail.</p></li>

 <li><p>Return |percentage|.</p></li>
</ol>
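
<div class="example">

 <p>A minimal Python sketch of these rules follows. It assumes that the WebVTT percentage syntax is
 one or more ASCII digits, optionally a dot followed by one or more ASCII digits, and then a
 percent sign; Python's <code>float</code> is used in place of the rules for parsing floating-point
 number values, which gives the same result for strings of that shape. The function name and the
 use of <code>None</code> for "nothing" are illustrative choices.</p>

```python
import re

# Assumed shape of a WebVTT percentage: digits, optional fraction, percent sign.
WEBVTT_PERCENTAGE = re.compile(r"[0-9]+(?:\.[0-9]+)?%")


def parse_percentage_string(text):
    """Return a number in the range 0..100, or None if the algorithm fails."""
    if WEBVTT_PERCENTAGE.fullmatch(text) is None:
        return None
    number = float(text[:-1])  # drop the trailing percent sign
    if number < 0 or number > 100:
        return None
    return number
```

 <p>For instance, <code>parse_percentage_string("12.5%")</code> returns 12.5, while strings such as
 <code>"50"</code>, <code>".5%"</code> and <code>"100.1%"</code> all fail.</p>

</div>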


<h3 id=cue-timings-and-settings-parsing algorithm>WebVTT cue timings and settings parsing</h3>

<p>When the algorithm above says to <dfn>collect WebVTT cue timings and settings</dfn> from a string
|input| using a <a>text track list of regions</a> |regions| for a <a>WebVTT cue</a> |cue|, the user
agent must run the following algorithm.</p>

<ol algorithm="collect WebVTT cue timings and settings">

 <li><p>Let |input| be the string being parsed.</p></li>

 <li><p>Let |position| be a pointer into |input|, initially pointing at the start of the
 string.</p></li>

 <li><p><a>Skip whitespace</a>.</p></li>

 <li><p><a>Collect a WebVTT timestamp</a>. If that algorithm fails, then abort these steps and
 return failure. Otherwise, let |cue|'s <a>text track cue start time</a> be the collected
 time.</p></li>

 <li><p><a>Skip whitespace</a>.</p></li>

 <!-- we can't be beyond the end of the string until we've seen the arrow, since we know the arrow
 is in the string and nothing we've done so far would move us past the first "-". -->

 <li><p>If <!--|position| is beyond the end of |input| or if--> the character at |position| is not a
 U+002D HYPHEN-MINUS character (-) then abort these steps and return failure. Otherwise, move
 |position| forwards one character.</p></li>

 <li><p>If <!--|position| is beyond the end of |input| or if--> the character at |position| is not a
 U+002D HYPHEN-MINUS character (-) then abort these steps and return failure. Otherwise, move
 |position| forwards one character.</p></li>

 <li><p>If <!--|position| is beyond the end of |input| or if--> the character at |position| is not a
 U+003E GREATER-THAN SIGN character (>) then abort these steps and return failure. Otherwise, move
 |position| forwards one character.</p></li>

 <li><p><a>Skip whitespace</a>.</p></li>

 <li><p><a>Collect a WebVTT timestamp</a>. If that algorithm fails, then abort these steps and
 return failure. Otherwise, let |cue|'s <a>text track cue end time</a> be the collected
 time.</p></li>

 <li><p>Let |remainder| be the trailing substring of |input| starting at |position|.</p></li>

 <li><p><a>Parse the WebVTT cue settings</a> from |remainder| using |regions| for |cue|.</p></li>

</ol>
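
<div class="example">

 <p>The timing-line steps can be illustrated with the Python sketch below. The timestamp helper is
 a regular-expression approximation of the timestamp collection algorithm, which is defined
 elsewhere in this specification, and all function names are hypothetical. The function returns the
 start time and end time in seconds together with the remainder that would be handed to the cue
 settings parser, or <code>None</code> on failure.</p>

```python
import re

WHITESPACE = re.compile(r"[ \t\n\f\r]*")
# Approximation: [hours:]minutes:seconds.milliseconds
TIMESTAMP = re.compile(r"([0-9]+):([0-9]{2})(?::([0-9]{2}))?\.([0-9]{3})(?![0-9])")


def collect_timestamp(text, pos):
    """Return (seconds, new position), or None if no timestamp starts at pos."""
    match = TIMESTAMP.match(text, pos)
    if match is None:
        return None
    first, second, third, millis = match.groups()
    if third is not None:
        hours, minutes, seconds = int(first), int(second), int(third)
    elif len(first) == 2 and int(first) <= 59:
        hours, minutes, seconds = 0, int(first), int(second)
    else:
        return None  # the first field must be hours, but no third field follows
    if minutes > 59 or seconds > 59:
        return None
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000, match.end()


def collect_cue_timings(line):
    """Return (start, end, remainder), or None if the line is not a timing line."""
    pos = WHITESPACE.match(line, 0).end()
    start = collect_timestamp(line, pos)
    if start is None:
        return None
    start_time, pos = start
    pos = WHITESPACE.match(line, pos).end()
    if line[pos:pos + 3] != "-->":
        return None
    pos = WHITESPACE.match(line, pos + 3).end()
    end = collect_timestamp(line, pos)
    if end is None:
        return None
    end_time, pos = end
    return start_time, end_time, line[pos:]


simple = collect_cue_timings("00:00.000 --> 01:00.000")
with_settings = collect_cue_timings("  00:01:02.500-->00:01:04.000 line:0 align:start")
```

 <p>White space around the arrow is optional in this sketch, as in the steps above, but the arrow
 itself has to appear as three consecutive characters.</p>

</div>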

<p>When the user agent is to <dfn>parse the WebVTT cue settings</dfn> from a string |input| using a
<a>text track list of regions</a> |regions| for a <a>text track cue</a> |cue|, the user agent must
run the following steps:</p>

<ol algorithm="parse the WebVTT cue settings">

 <li><p>Let |settings| be the result of <a lt="split a string on spaces">splitting |input| on
 spaces</a>.</p></li>

 <li>

  <p>For each token |setting| in the list |settings|, run the following substeps:</p>

  <ol>

   <li><p>If |setting| does not contain a U+003A COLON character (:), or if the first U+003A COLON
   character (:) in |setting| is either the first or last character of |setting|, then jump to the
   step labeled <i>next setting</i>.</p></li>

   <li><p>Let |name| be the leading substring of |setting| up to and excluding the first U+003A
   COLON character (:) in that string.</p></li>

   <li><p>Let |value| be the trailing substring of |setting| starting from the character immediately
   after the first U+003A COLON character (:) in that string.</p></li>

   <li>

    <p>Run the appropriate substeps that apply for the value of |name|, as follows:</p>

    <dl>

     <dt>If |name| is a <a>case-sensitive</a> match for "<code>region</code>"</dt>

     <dd>
      <ol>
       <li><p>Let |cue|'s <a>WebVTT cue region</a> be the last <a>WebVTT region</a> in |regions|
       whose <a>WebVTT region identifier</a> is |value|, if any, or null otherwise.</p></li>
      </ol>
     </dd>

     <dt>If |name| is a <a>case-sensitive</a> match for "<code>vertical</code>"</dt>

     <dd>

      <ol>

       <li><p>If |value| is a <a>case-sensitive</a> match for the string "<code>rl</code>", then let
       |cue|'s <a>WebVTT cue writing direction</a> be <a lt="WebVTT cue vertical growing left
       writing direction">vertical growing left</a>.</p></li>

       <li><p>Otherwise, if |value| is a <a>case-sensitive</a> match for the string
       "<code>lr</code>", then let |cue|'s <a>WebVTT cue writing direction</a> be <a lt="WebVTT cue
       vertical growing right writing direction">vertical growing right</a>.</p></li>

       <li><p>If |cue|'s <a>WebVTT cue writing direction</a> is not <a lt="WebVTT cue horizontal
       writing direction">horizontal</a>, let |cue|'s <a>WebVTT cue region</a> be null (there are no
       vertical regions).</p></li>

      </ol>

     </dd>

     <dt>If |name| is a <a>case-sensitive</a> match for "<code>line</code>"</dt>

     <dd>

      <ol>

       <li><p>If |value| contains a U+002C COMMA character (,), then let |linepos| be the leading
       substring of |value| up to and excluding the first U+002C COMMA character (,) in that string
       and let |linealign| be the trailing substring of |value| starting from the character
       immediately after the first U+002C COMMA character (,) in that string.</p></li>

       <li><p>Otherwise let |linepos| be the full |value| string and |linealign| be null.</p></li>

       <li><p>If |linepos| does not contain at least one <a>ASCII digit</a>, then jump to the step
       labeled <i>next setting</i>.</p></li>

       <li>
        <dl>
         <dt><p>If the last character in |linepos| is a U+0025 PERCENT SIGN character (%)</p></dt>

         <dd><p>If <a>parse a percentage string</a> from |linepos| doesn't fail, let |number| be the
         returned |percentage|, otherwise jump to the step labeled <i>next setting</i>.</p></dd>

         <dt><p>Otherwise</p></dt>

         <dd>
          <ol>
           <li><p>If |linepos| contains any characters other than U+002D HYPHEN-MINUS characters
           (-), <a>ASCII digits</a>, and U+002E DOT character (.), then jump to the step labeled
           <i>next setting</i>.</p></li>

           <li><p>If any character in |linepos| other than the first character is a U+002D
           HYPHEN-MINUS character (-), then jump to the step labeled <i>next setting</i>.</p></li>

           <li><p>If |linepos| contains more than one U+002E DOT character (.), then jump to the
           step labeled <i>next setting</i>.</p></li>

           <li><p>If there is a U+002E DOT character (.) and the character before or the character
           after is not an <a>ASCII digit</a>, or if the U+002E DOT character (.) is the first or
           the last character, then jump to the step labeled <i>next setting</i>.</p></li>

           <li><p>Let |number| be the result of parsing |linepos| using the <a>rules for parsing
           floating-point number values</a>. [[!HTML]]</p></li>

           <li><p>If |number| is an error, then jump to the step labeled <i>next
           setting</i>.</p></li>
          </ol>
         </dd>
        </dl>
       </li>

       <li><p>If |linealign| is a <a>case-sensitive</a> match for the string "<code>start</code>",
       then let |cue|'s <a>WebVTT cue line alignment</a> be <a lt="WebVTT cue line start
       alignment">start alignment</a>.</p></li>

       <li><p>Otherwise, if |linealign| is a <a>case-sensitive</a> match for the string
       "<code>center</code>", then let |cue|'s <a>WebVTT cue line alignment</a> be <a lt="WebVTT cue
       line center alignment">center alignment</a>.</p></li>

       <li><p>Otherwise, if |linealign| is a <a>case-sensitive</a> match for the string
       "<code>end</code>", then let |cue|'s <a>WebVTT cue line alignment</a> be <a lt="WebVTT cue
       line end alignment">end alignment</a>.</p></li>

       <li><p>Otherwise, if |linealign| is not null, then jump to the step labeled <i>next
       setting</i>.</p></li>

       <li><p>Let |cue|'s <a>WebVTT cue line</a> be |number|.</p></li>

       <li><p>If the last character in |linepos| is a U+0025 PERCENT SIGN character (%), then let
       |cue|'s <a>WebVTT cue snap-to-lines flag</a> be false. Otherwise, let it be true.</p></li>

       <li><p>If |cue|'s <a>WebVTT cue line</a> is not <a lt="WebVTT cue line automatic">auto</a>,
       let |cue|'s <a>WebVTT cue region</a> be null (the cue has been explicitly positioned with a
       line offset and thus drops out of the region).</p></li>

      </ol>

     </dd>

     <dt>If |name| is a <a>case-sensitive</a> match for "<code>position</code>"</dt>

     <dd>

      <ol>

       <li><p>If |value| contains a U+002C COMMA character (,), then let |colpos| be the leading
       substring of |value| up to and excluding the first U+002C COMMA character (,) in that string
       and let |colalign| be the trailing substring of |value| starting from the character
       immediately after the first U+002C COMMA character (,) in that string.</p></li>

       <li><p>Otherwise let |colpos| be the full |value| string and |colalign| be null.</p></li>

       <li><p>If <a>parse a percentage string</a> from |colpos| doesn't fail, let |number| be the
       returned |percentage|, otherwise jump to the step labeled <i>next setting</i> (<a lt="WebVTT
       cue position">position</a>'s value remains the special value <a lt="WebVTT cue automatic
       position">auto</a>).</p></li>

       <li><p>If |colalign| is a <a>case-sensitive</a> match for the string
       "<code>line-left</code>", then let |cue|'s <a>WebVTT cue position alignment</a> be <a
       lt="WebVTT cue position line-left alignment">line-left alignment</a>.</p></li>

       <li><p>Otherwise, if |colalign| is a <a>case-sensitive</a> match for the string
       "<code>center</code>", then let |cue|'s <a>WebVTT cue position alignment</a> be <a lt="WebVTT
       cue position center alignment">center alignment</a>.</p></li>

       <li><p>Otherwise, if |colalign| is a <a>case-sensitive</a> match for the string
       "<code>line-right</code>", then let |cue|'s <a>WebVTT cue position alignment</a> be <a
       lt="WebVTT cue position line-right alignment">line-right alignment</a>.</p></li>

       <li><p>Otherwise, if |colalign| is not null, then jump to the step labeled <i>next
       setting</i>.</p></li>

       <li><p>Let |cue|'s <a lt="WebVTT cue position">position</a> be |number|.</p></li>

      </ol>

     </dd>

     <dt>If |name| is a <a>case-sensitive</a> match for "<code>size</code>"</dt>

     <dd>

      <ol>

       <li><p>If <a>parse a percentage string</a> from |value| doesn't fail, let |number| be the
       returned |percentage|, otherwise jump to the step labeled <i>next setting</i>.</p></li>

       <li><p>Let |cue|'s <a>WebVTT cue size</a> be |number|.</p></li>

       <li><p>If |cue|'s <a>WebVTT cue size</a> is not 100, let |cue|'s <a>WebVTT cue region</a> be
       null (the cue has been explicitly sized and thus drops out of the region).</p></li>

      </ol>

     </dd>

     <dt>If |name| is a <a>case-sensitive</a> match for "<code>align</code>"</dt>

     <dd>

      <ol>

       <li><p>If |value| is a <a>case-sensitive</a> match for the string "<code>start</code>", then
       let |cue|'s <a>WebVTT cue text alignment</a> be <a lt="WebVTT cue start alignment">start
       alignment</a>.</p></li>

       <li><p>If |value| is a <a>case-sensitive</a> match for the string "<code>center</code>", then
       let |cue|'s <a>WebVTT cue text alignment</a> be <a lt="WebVTT cue center alignment">center
       alignment</a>.</p></li>

       <li><p>If |value| is a <a>case-sensitive</a> match for the string "<code>end</code>", then
       let |cue|'s <a>WebVTT cue text alignment</a> be <a lt="WebVTT cue end alignment">end
       alignment</a>.</p></li>

       <li><p>If |value| is a <a>case-sensitive</a> match for the string "<code>left</code>", then
       let |cue|'s <a>WebVTT cue text alignment</a> be <a lt="WebVTT cue left alignment">left
       alignment</a>.</p></li>

       <li><p>If |value| is a <a>case-sensitive</a> match for the string "<code>right</code>", then
       let |cue|'s <a>WebVTT cue text alignment</a> be <a lt="WebVTT cue right alignment">right
       alignment</a>.</p></li>

      </ol>

     </dd>

    </dl>

   </li>

   <li><p><i>Next setting</i>: Continue to the next token, if any.</p></li> <!-- this step is just
   here to give the algorithms above a clean way to 'break' -->

  </ol>

 </li>

</ol>
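
<p class="note">The following non-normative sketch illustrates the "<code>position</code>" branch of the steps above. The cue is modelled as a plain dictionary, and <code>parse_percentage</code> and <code>apply_position_setting</code> are hypothetical names chosen for this illustration. The percentage helper assumes a percentage is a run of ASCII digits with an optional fractional part, followed by a percent sign, with a value between 0 and 100.</p>

```python
import re

# Assumed shape of a percentage string: digits, optional fraction, then "%".
_PERCENTAGE = re.compile(r"[0-9]+(?:\.[0-9]+)?%")

_POSITION_ALIGNMENTS = {"line-left", "center", "line-right"}


def parse_percentage(text):
    """Return a float in [0, 100], or None on failure (hypothetical helper)."""
    if not _PERCENTAGE.fullmatch(text):
        return None
    number = float(text[:-1])
    return number if 0 <= number <= 100 else None


def apply_position_setting(cue, value):
    """Apply the value of a "position" setting to cue (a dict in this sketch)."""
    # Split at the first comma; without a comma there is no alignment part.
    colpos, comma, colalign = value.partition(",")
    if not comma:
        colalign = None
    number = parse_percentage(colpos)
    if number is None:
        return  # "next setting": the position keeps its previous value
    if colalign is not None:
        if colalign not in _POSITION_ALIGNMENTS:
            return  # "next setting": unknown alignment, nothing is changed
        cue["position_alignment"] = colalign
    cue["position"] = number
```

<p class="note">In this sketch, a value such as "<code>10%,line-left</code>" updates both entries, whereas "<code>50%,middle</code>" or "<code>150%</code>" leaves the dictionary untouched, mirroring the jumps to the <i>next setting</i> step.</p>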

<p>When this specification says that a user agent is to <dfn>collect a WebVTT timestamp</dfn>, the
user agent must run the following steps:</p>

<ol algorithm="collect a WebVTT timestamp">

 <li><p>Let |input| and |position| be the same variables as those of the same name in the algorithm
 that invoked these steps.</p></li>

 <li><p>Let |most significant units| be <i>minutes</i>.</p></li>

 <li><p>If |position| is past the end of |input|, return an error and abort these steps.</p></li>

 <li><p>If the character indicated by |position| is not an <a>ASCII digit</a>, then return an error
 and abort these steps.</p></li>

 <li><p><a>Collect a sequence of code points</a> that are <a>ASCII digits</a>, and let |string| be
 the collected substring.</p></li>

 <li><p>Interpret |string| as a base-ten integer. Let <var>value<sub>1</sub></var> be that
 integer.</p></li>

 <li><p>If |string| is not exactly two characters in length, or if <var>value<sub>1</sub></var> is
 greater than 59, let |most significant units| be <i>hours</i>.</p></li>

 <li><p>If |position| is beyond the end of |input| or if the character at |position| is not a U+003A
 COLON character (:), then return an error and abort these steps. Otherwise, move |position|
 forwards one character.</p></li>

 <li><p><a>Collect a sequence of code points</a> that are <a>ASCII digits</a>, and let |string| be
 the collected substring.</p></li>

 <li><p>If |string| is not exactly two characters in length, return an error and abort these
 steps.</p></li>

 <li><p>Interpret |string| as a base-ten integer. Let <var>value<sub>2</sub></var> be that
 integer.</p></li>

 <li>

  <p>If |most significant units| is <i>hours</i>, or if |position| is not beyond the end of |input|
  and the character at |position| is a U+003A COLON character (:), run these substeps:</p>

  <ol>

   <li><p>If |position| is beyond the end of |input| or if the character at |position| is not a
   U+003A COLON character (:), then return an error and abort these steps. Otherwise, move
   |position| forwards one character.</p></li>

   <li><p><a>Collect a sequence of code points</a> that are <a>ASCII digits</a>, and let |string| be
   the collected substring.</p></li>

   <li><p>If |string| is not exactly two characters in length, return an error and abort these
   steps.</p></li>

   <li><p>Interpret |string| as a base-ten integer. Let <var>value<sub>3</sub></var> be that
   integer.</p></li>

  </ol>

  <p>Otherwise (if |most significant units| is not <i>hours</i>, and either |position| is beyond the
  end of |input|, or the character at |position| is not a U+003A COLON character (:)), let
  <var>value<sub>3</sub></var> have the value of <var>value<sub>2</sub></var>, then
  <var>value<sub>2</sub></var> have the value of <var>value<sub>1</sub></var>, then let
  <var>value<sub>1</sub></var> equal zero.</p>

 </li>

 <li><p>If |position| is beyond the end of |input| or if the character at |position| is not a U+002E
 FULL STOP character (.), then return an error and abort these steps. Otherwise, move |position|
 forwards one character.</p></li>

 <li><p><a>Collect a sequence of code points</a> that are <a>ASCII digits</a>, and let |string| be
 the collected substring.</p></li>

 <li><p>If |string| is not exactly three characters in length, return an error and abort these
 steps.</p></li>

 <li><p>Interpret |string| as a base-ten integer. Let <var>value<sub>4</sub></var> be that
 integer.</p></li>

 <li><p>If <var>value<sub>2</sub></var> is greater than 59 or if <var>value<sub>3</sub></var> is
 greater than 59, return an error and abort these steps.</p></li>

 <!-- no need to check if <var>value<sub>4</sub></var> is greater than 999, since we know it had
 exactly three characters in the range 0-9, so we know it's a number in the range 0-999 -->

 <li><p>Let |result| be <var>value<sub>1</sub></var>&times;60&times;60 +
 <var>value<sub>2</sub></var>&times;60 + <var>value<sub>3</sub></var> +
 <var>value<sub>4</sub></var>&#x2215;1000. <!-- &#x00f7; is the division sign if people prefer that
 to the slash --></p></li>

 <li><p>Return |result|.</p></li>

</ol>
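
<p class="note">The steps to collect a timestamp can be sketched as follows. This is a non-normative illustration: the function name <code>collect_timestamp</code>, the use of <code>None</code> to signal an error, and returning the new position alongside the result are all choices made for this sketch.</p>

```python
def collect_timestamp(text, position=0):
    """Return (seconds, new_position), or None if no timestamp can be collected."""

    def digits(pos):
        # Collect a run of ASCII digits starting at pos.
        end = pos
        while end < len(text) and text[end] in "0123456789":
            end += 1
        return text[pos:end], end

    def at(pos, char):
        return pos < len(text) and text[pos] == char

    first, position = digits(position)
    if not first:
        return None
    value1 = int(first)
    # Anything other than a two-digit number up to 59 can only be hours.
    hours_required = len(first) != 2 or value1 > 59
    if not at(position, ":"):
        return None
    second, position = digits(position + 1)
    if len(second) != 2:
        return None
    value2 = int(second)
    if hours_required or at(position, ":"):
        if not at(position, ":"):
            return None
        third, position = digits(position + 1)
        if len(third) != 2:
            return None
        value3 = int(third)
    else:
        # Only minutes and seconds were given: shift them down, hours are zero.
        value1, value2, value3 = 0, value1, value2
    if not at(position, "."):
        return None
    fraction, position = digits(position + 1)
    if len(fraction) != 3:
        return None
    if value2 > 59 or value3 > 59:
        return None
    return value1 * 3600 + value2 * 60 + value3 + int(fraction) / 1000, position
```

<p class="note">With this sketch, "<code>01:02.500</code>" and "<code>00:01:02.500</code>" both yield 62.5 seconds, while "<code>00:60.000</code>" and "<code>1:02.500</code>" are errors.</p>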


<h3 id=cue-text-parsing-rules algorithm>WebVTT cue text parsing rules</h3>

<p>A <dfn>WebVTT Node Object</dfn> is a conceptual construct used to represent components of <a>cue
text</a> so that its processing can be described without reference to the underlying syntax.</p>

<p>There are two broad classes of <a lt="WebVTT Node Object">WebVTT Node Objects</a>: <a lt="WebVTT
Internal Node Object">WebVTT Internal Node Objects</a> and <a lt="WebVTT Leaf Node Object">WebVTT
Leaf Node Objects</a>.</p>

<p><dfn lt="WebVTT Internal Node Object">WebVTT Internal Node Objects</dfn> are those that can
contain further <a lt="WebVTT Node Object">WebVTT Node Objects</a>. They are conceptually similar to
elements in HTML or the DOM. <a lt="WebVTT Internal Node Object">WebVTT Internal Node Objects</a>
have an ordered list of child <a lt="WebVTT Node Object">WebVTT Node Objects</a>. The <a>WebVTT
Internal Node Object</a> is said to be the <i>parent</i> of the children. Cycles do not occur; the
parent-child relationships so constructed form a tree structure. <a lt="WebVTT Internal Node
Object">WebVTT Internal Node Objects</a> also have an ordered list of <a lt="cue component class
names">class names</a>, known as their <dfn lt="WebVTT Node Object's applicable classes">applicable
classes</dfn>, and a language, known as their <dfn lt="WebVTT Node Object's applicable
language">applicable language</dfn>, which is to be interpreted as a BCP 47 language tag.
[[!BCP47]]</p>

<p class=note>User agents will add a language tag as the <a lt="WebVTT Node Object's applicable
language">applicable language</a> even if it is not valid, or not even well-formed. [[!BCP47]]</p>

<p>There are several concrete classes of <a lt="WebVTT Internal Node Object">WebVTT Internal Node
Objects</a>:</p>

<dl>

 <dt><dfn lt="List of WebVTT Node Objects">Lists of WebVTT Node Objects</dfn></dt>
 <dd>
  <p>These are used as root nodes for trees of <a lt="WebVTT Node Object">WebVTT Node
  Objects</a>.</p>
 </dd>

 <dt><dfn lt="WebVTT Class Object">WebVTT Class Objects</dfn></dt>
 <dd>
  <p>These represent spans of text (a <a>WebVTT cue class span</a>) in <a>cue text</a>, and are used
  to annotate parts of the cue with <a lt="WebVTT Node Object's applicable classes">applicable
  classes</a> without implying further meaning (such as italics or bold).</p>
 </dd>

 <dt><dfn lt="WebVTT Italic Object">WebVTT Italic Objects</dfn></dt>
 <dd>
  <p>These represent spans of italic text (a <a>WebVTT cue italics span</a>) in <a>WebVTT caption or
  subtitle cue text</a>.</p>
 </dd>

 <dt><dfn lt="WebVTT Bold Object">WebVTT Bold Objects</dfn></dt>
 <dd>
  <p>These represent spans of bold text (a <a>WebVTT cue bold span</a>) in <a>WebVTT caption or
  subtitle cue text</a>.</p>
 </dd>

 <dt><dfn lt="WebVTT Underline Object">WebVTT Underline Objects</dfn></dt>
 <dd>
  <p>These represent spans of underlined text (a <a>WebVTT cue underline span</a>) in <a>WebVTT
  caption or subtitle cue text</a>.</p>
 </dd>

 <dt><dfn lt="WebVTT Ruby Object">WebVTT Ruby Objects</dfn></dt>
 <dd>
  <p>These represent spans of ruby (a <a>WebVTT cue ruby span</a>) in <a>WebVTT caption or subtitle
  cue text</a>.</p>
 </dd>

 <dt><dfn lt="WebVTT Ruby Text Object">WebVTT Ruby Text Objects</dfn></dt>
 <dd>
  <p>These represent spans of ruby text (a <a>WebVTT cue ruby text span</a>) in <a>WebVTT caption or
  subtitle cue text</a>.</p>
 </dd>

 <dt><dfn lt="WebVTT Voice Object">WebVTT Voice Objects</dfn></dt>
 <dd>
  <p>These represent spans of text associated with a specific voice (a <a>WebVTT cue voice span</a>)
  in <a>WebVTT caption or subtitle cue text</a>. A <a>WebVTT Voice Object</a> has a value, which is
  the name of the voice.</p>
 </dd>

 <dt><dfn lt="WebVTT Language Object">WebVTT Language Objects</dfn></dt>
 <dd>
  <p>These represent spans of text (a <a>WebVTT cue language span</a>) in <a>WebVTT caption or
  subtitle cue text</a>, and are used to annotate parts of the cue where the <a lt="WebVTT Node
  Object's applicable language">applicable language</a> might be different than the surrounding
  text's, without implying further meaning (such as italics or bold).</p>
 </dd>

</dl>

<p><dfn lt="WebVTT Leaf Node Object">WebVTT Leaf Node Objects</dfn> are those that contain data,
such as text, and cannot contain child <a lt="WebVTT Node Object">WebVTT Node Objects</a>.</p>

<p>There are two concrete classes of <a lt="WebVTT Leaf Node Object">WebVTT Leaf Node
Objects</a>:</p>

<dl>

 <dt><dfn lt="WebVTT Text Object">WebVTT Text Objects</dfn></dt>
 <dd>
  <p>A fragment of text. A <a>WebVTT Text Object</a> has a value, which is the text it
  represents.</p>
 </dd>

 <dt><dfn lt="WebVTT Timestamp Object">WebVTT Timestamp Objects</dfn></dt>
 <dd>
  <p>A timestamp. A <a>WebVTT Timestamp Object</a> has a value, in seconds and fractions of a
  second, which is the time represented by the timestamp.</p>
 </dd>

</dl>
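
<p class="note">As a non-normative illustration, the node classes described above could be modelled with the following data structures. The class and field names are hypothetical; in particular, the single <code>InternalNode</code> class with a <code>kind</code> field stands in for the separate concrete classes of internal node.</p>

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class TextNode:
    value: str  # the text this leaf represents


@dataclass
class TimestampNode:
    value: float  # time in seconds, including fractions of a second


@dataclass
class InternalNode:
    kind: str  # "root", "c", "i", "b", "u", "ruby", "rt", "v" or "lang"
    classes: List[str] = field(default_factory=list)  # applicable classes
    language: Optional[str] = None  # applicable language, not validated
    value: Optional[str] = None  # voice name, meaningful for "v" nodes only
    children: List[Union["InternalNode", TextNode, TimestampNode]] = field(
        default_factory=list)
    # The parent link is left out of repr and comparisons to avoid recursion.
    parent: Optional["InternalNode"] = field(
        default=None, repr=False, compare=False)

    def append(self, child):
        """Add child as the last child, recording the parent of internal nodes."""
        if isinstance(child, InternalNode):
            child.parent = self
        self.children.append(child)
```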

<p>The <dfn>WebVTT cue text parsing rules</dfn> consist of the following algorithm. The input is a
string |input| supposedly containing <a>WebVTT caption or subtitle cue text</a>, and optionally a
fallback language |language|. This algorithm returns a <a>list of WebVTT Node Objects</a>.</p>

<ol algorithm="WebVTT cue text parsing">

 <li><p>Let |input| be the string being parsed.</p></li>

 <li><p>Let |position| be a pointer into |input|, initially pointing at the start of the
 string.</p></li>

 <li><p>Let |result| be a <a>list of WebVTT Node Objects</a>, initially empty.</p></li>

 <li><p>Let |current| be the <a>WebVTT Internal Node Object</a> |result|.</p></li>

 <li><p>Let |language stack| be a stack of language tags, initially empty.</p></li>

 <li><p>If |language| is set, set |result|'s <a lt="WebVTT Node Object's applicable
 language">applicable language</a> to |language|, and push |language| onto the |language
 stack|.</p></li>

 <li><p><i>Loop</i>: If |position| is past the end of |input|, return |result| and abort these
 steps.</p></li>

 <li><p>Let |token| be the result of invoking the <a>WebVTT cue text tokenizer</a>.</p></li>

 <li>

  <p>Run the appropriate steps given the type of |token|:</p>

  <dl>

   <dt>If |token| is a string</dt>
   <dd>

    <ol>

     <li><p>Create a <a>WebVTT Text Object</a> whose value is the value of the string token
     |token|.</p></li>

     <li><p>Append the newly created <a>WebVTT Text Object</a> to |current|.</p></li>

    </ol>

   </dd>

   <dt>If |token| is a start tag</dt>
   <dd>

    <p>How the start tag token |token| is processed depends on its tag name, as follows:</p>

    <dl>

     <dt>If the tag name is "<code>c</code>"</dt>
     <dd>
      <p><a lt="attach a WebVTT Internal Node Object">Attach</a> a <a>WebVTT Class Object</a>.</p>
     </dd>

     <dt>If the tag name is "<code>i</code>"</dt>
     <dd>
      <p><a lt="attach a WebVTT Internal Node Object">Attach</a> a <a>WebVTT Italic Object</a>.</p>
     </dd>

     <dt>If the tag name is "<code>b</code>"</dt>
     <dd>
      <p><a lt="attach a WebVTT Internal Node Object">Attach</a> a <a>WebVTT Bold Object</a>.</p>
     </dd>

     <dt>If the tag name is "<code>u</code>"</dt>
     <dd>
      <p><a lt="attach a WebVTT Internal Node Object">Attach</a> a <a>WebVTT Underline
      Object</a>.</p>
     </dd>

     <dt>If the tag name is "<code>ruby</code>"</dt>
     <dd>
      <p><a lt="attach a WebVTT Internal Node Object">Attach</a> a <a>WebVTT Ruby Object</a>.</p>
     </dd>

     <dt>If the tag name is "<code>rt</code>"</dt>
     <dd>
      <p>If |current| is a <a>WebVTT Ruby Object</a>, then <a lt="attach a WebVTT Internal Node
      Object">attach</a> a <a>WebVTT Ruby Text Object</a>.</p>
     </dd>

     <dt>If the tag name is "<code>v</code>"</dt>
     <dd>
      <p><a lt="attach a WebVTT Internal Node Object">Attach</a> a <a>WebVTT Voice Object</a>, and
      set its value to the token's annotation string, or the empty string if there is no annotation
      string.</p>
     </dd>

     <dt>If the tag name is "<code>lang</code>"</dt>
     <dd>
      <p>Push the value of the token's annotation string, or the empty string if there is no
      annotation string, onto the |language stack|; then <a lt="attach a WebVTT Internal Node
      Object">attach</a> a <a>WebVTT Language Object</a>.</p>
     </dd>

     <dt>Otherwise</dt>
     <dd>
      <p>Ignore the token.</p>
     </dd>

    </dl>

    <p>When the steps above say to <dfn>attach a WebVTT Internal Node Object</dfn> of a particular
    concrete class, the user agent must run the following steps:</p>

    <ol>

     <li><p>Create a new <a>WebVTT Internal Node Object</a> of the specified concrete
     class.</p></li>

     <li><p>Set the new object's list of <a lt="WebVTT Node Object's applicable classes">applicable
     classes</a> to the list of classes in the token, excluding any classes that are the empty
     string.</p></li>

     <li><p>Set the new object's <a lt="WebVTT Node Object's applicable language">applicable
     language</a> to the top entry on the |language stack|, if the stack is not empty.</p></li>

     <li><p>Append the newly created node object to |current|.</p></li>

     <li><p>Let |current| be the newly created node object.</p></li>

    </ol>

   </dd>

   <dt>If |token| is an end tag</dt>
   <dd>

    <p>If any of the following conditions is true, then let |current| be the parent node of
    |current|.</p>

    <ul class="brief">

     <li>The tag name of the end tag token |token| is "<code>c</code>" and |current| is a <a>WebVTT
     Class Object</a>.</li>

     <li>The tag name of the end tag token |token| is "<code>i</code>" and |current| is a <a>WebVTT
     Italic Object</a>.</li>

     <li>The tag name of the end tag token |token| is "<code>b</code>" and |current| is a <a>WebVTT
     Bold Object</a>.</li>

     <li>The tag name of the end tag token |token| is "<code>u</code>" and |current| is a <a>WebVTT
     Underline Object</a>.</li>

     <li>The tag name of the end tag token |token| is "<code>ruby</code>" and |current| is a
     <a>WebVTT Ruby Object</a>.</li>

     <li>The tag name of the end tag token |token| is "<code>rt</code>" and |current| is a <a>WebVTT
     Ruby Text Object</a>.</li>

     <li>The tag name of the end tag token |token| is "<code>v</code>" and |current| is a <a>WebVTT
     Voice Object</a>.</li>

    </ul>

    <p>Otherwise, if the tag name of the end tag token |token| is "<code>lang</code>" and |current|
    is a <a>WebVTT Language Object</a>, then let |current| be the parent node of |current|, and pop
    the top value from the |language stack|.</p>

    <p>Otherwise, if the tag name of the end tag token |token| is "<code>ruby</code>" and |current|
    is a <a>WebVTT Ruby Text Object</a>, then let |current| be the parent node of the parent node of
    |current|.</p>

    <p>Otherwise, ignore the token.</p>

   </dd>

   <dt>If |token| is a timestamp tag</dt>
   <dd>

    <ol>

     <li><p>Let |input| be the tag value.</p></li>

     <li><p>Let |position| be a pointer into |input|, initially pointing at the start of the
     string.</p></li>

     <li><p><a>Collect a WebVTT timestamp</a>.</p></li>

     <li>

      <p>If that algorithm does not fail, and if |position| now points at the end of |input| (i.e.
      there are no trailing characters after the timestamp), then create a <a>WebVTT Timestamp
      Object</a> whose value is the collected time, then append it to |current|.</p>

      <p>Otherwise, ignore the token.</p>

     </li>

    </ol>

   </dd>

  </dl>

 </li>

 <li><p>Jump to the step labeled <i>loop</i>.</p></li>

</ol>
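
<p class="note">The tree-building part of the algorithm above can be sketched as follows. This non-normative illustration assumes the tokens have already been produced, as tuples of the hypothetical forms <code>("string", value)</code>, <code>("start", name, classes, annotation)</code>, <code>("end", name)</code> and <code>("timestamp", value)</code>. Nodes are plain dictionaries, a list named <code>path</code> replaces parent pointers, and a regular expression is used as a compact stand-in for collecting a timestamp that spans the whole tag value.</p>

```python
import re

# Stand-in for "collect a WebVTT timestamp" with no trailing characters.
_TIMESTAMP = re.compile(r"(?:([0-9]+):)?([0-5][0-9]):([0-5][0-9])\.([0-9]{3})")
_KINDS = {"c", "i", "b", "u", "ruby", "rt", "v", "lang"}


def _node(kind, classes=(), language=None, value=None):
    return {"kind": kind, "classes": [c for c in classes if c],
            "language": language, "value": value, "children": []}


def build_tree(tokens, language=None):
    """Build a tree of dict nodes from an iterable of token tuples."""
    root = _node("root", language=language)
    path = [root]  # path[-1] is "current"; path[-2] is its parent
    languages = [language] if language is not None else []
    for token in tokens:
        current = path[-1]
        if token[0] == "string":
            current["children"].append({"kind": "text", "value": token[1]})
        elif token[0] == "start":
            _, name, classes, annotation = token
            if name not in _KINDS or (name == "rt" and current["kind"] != "ruby"):
                continue  # ignore the token
            if name == "lang":
                languages.append(annotation or "")
            node = _node(name, classes, languages[-1] if languages else None,
                         (annotation or "") if name == "v" else None)
            current["children"].append(node)
            path.append(node)
        elif token[0] == "end":
            name = token[1]
            if name in _KINDS and name == current["kind"]:
                path.pop()
                if name == "lang":
                    languages.pop()
            elif name == "ruby" and current["kind"] == "rt":
                del path[-2:]  # leave both the ruby text and the ruby object
        elif token[0] == "timestamp":
            match = _TIMESTAMP.fullmatch(token[1])
            if match:
                hours, minutes, seconds, millis = (
                    int(group or 0) for group in match.groups())
                current["children"].append({
                    "kind": "timestamp",
                    "value": hours * 3600 + minutes * 60 + seconds + millis / 1000,
                })
    return root
```

<p class="note">Mismatched end tags are simply skipped in this sketch, so a stray <code>("end", "b")</code> inside an italic span leaves the current node unchanged.</p>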

<p>The <dfn>WebVTT cue text tokenizer</dfn> is as follows. It emits a token, which is either a
string (whose value is a sequence of characters), a start tag (with a tag name, a list of classes,
and optionally an annotation), an end tag (with a tag name), or a timestamp tag (with a tag
value).</p>

<ol algorithm="WebVTT cue text tokenizer">

 <li><p>Let |input| and |position| be the same variables as those of the same name in the algorithm
 that invoked these steps.</p></li>

 <li><p>Let |tokenizer state| be <a>WebVTT data state</a>.</p></li>

 <li><p>Let |result| be the empty string.</p></li>

 <li><p>Let |buffer| be the empty string.</p></li>

 <li><p>Let |classes| be an empty list.</p></li>

 <li>

  <p><i>Loop</i>: If |position| is past the end of |input|, let |c| be an end-of-file marker.
  Otherwise, let |c| be the character in |input| pointed to by |position|.</p>

  <p class="note">An end-of-file marker is not a Unicode character; it only serves to end the
  tokenizer.</p>

 </li>

 <li>

  <p>Jump to the state given by |tokenizer state|:</p>

  <dl>

   <dt><dfn>WebVTT data state</dfn></dt>

   <dd>

    <p>Jump to the entry that matches the value of |c|:</p>

    <dl>

     <dt>U+0026 AMPERSAND (&amp;)</dt>
     <dd>
      <p>Set |tokenizer state| to the <a>HTML character reference in data state</a>, and jump to the
      step labeled <i>next</i>.</p>
     </dd>

     <dt>U+003C LESS-THAN SIGN (&lt;)</dt>
     <dd>
      <p>If |result| is the empty string, then set |tokenizer state| to the <a>WebVTT tag state</a>
      and jump to the step labeled <i>next</i>.</p>
      <p>Otherwise, return a string token whose value is |result| and abort these steps.</p>
     </dd>

     <dt>End-of-file marker</dt>
     <dd>
      <p>Return a string token whose value is |result| and abort these steps.</p>
     </dd>

     <dt>Anything else</dt>
     <dd>
      <p>Append |c| to |result| and jump to the step labeled <i>next</i>.</p>
     </dd>

    </dl>

   </dd>

   <dt><dfn>HTML character reference in data state</dfn></dt>

   <dd>

    <p>Attempt to <a>consume an HTML character reference</a>, with no <a>additional allowed
    character</a>.</p>

    <p>If nothing is returned, append a U+0026 AMPERSAND character (&amp;) to |result|.</p>

    <p>Otherwise, append the data of the character tokens that were returned to |result|.</p>

    <p>Then, in any case, set |tokenizer state| to the <a>WebVTT data state</a>, and jump to the
    step labeled <i>next</i>.</p>

   </dd>

   <dt><dfn>WebVTT tag state</dfn></dt>

   <dd>

    <p>Jump to the entry that matches the value of |c|:</p>

    <dl>

     <dt>U+0009 CHARACTER TABULATION (tab) character</dt>
     <dt>U+000A LINE FEED (LF) character</dt>
     <dt>U+000C FORM FEED (FF) character</dt>
     <dt>U+0020 SPACE character</dt>
     <dd>
      <!-- assert: >result< is the empty string -->
      <p>Set |tokenizer state| to the <a>WebVTT start tag annotation state</a>, and jump to the step
      labeled <i>next</i>.</p>
     </dd>

     <dt>U+002E FULL STOP character (.)</dt>
     <dd>
      <!-- assert: >result< is the empty string -->
      <p>Set |tokenizer state| to the <a>WebVTT start tag class state</a>, and jump to the step
      labeled <i>next</i>.</p>
     </dd>

     <dt>U+002F SOLIDUS character (/)</dt>
     <dd>
      <p>Set |tokenizer state| to the <a>WebVTT end tag state</a>, and jump to the step labeled
      <i>next</i>.</p>
     </dd>

     <dt><a>ASCII digits</a></dt>
     <dd>
      <p>Set |result| to |c|, set |tokenizer state| to the <a>WebVTT timestamp tag state</a>, and
      jump to the step labeled <i>next</i>.</p>
     </dd>

     <dt>U+003E GREATER-THAN SIGN character (>)</dt>
     <dd>
      <p>Advance |position| to the next character in |input|, then jump to the next "end-of-file
      marker" entry below.</p>
     </dd>

     <dt>End-of-file marker</dt>
     <dd>
      <p>Return a start tag whose tag name is the empty string, with no classes and no annotation,
      and abort these steps.</p>
     </dd>

     <dt>Anything else</dt>
     <dd>
      <p>Set |result| to |c|, set |tokenizer state| to the <a>WebVTT start tag state</a>, and jump
      to the step labeled <i>next</i>.</p>
     </dd>

    </dl>

   </dd>

   <dt><dfn>WebVTT start tag state</dfn></dt>

   <dd>

    <p>Jump to the entry that matches the value of |c|:</p>

    <dl>

     <dt>U+0009 CHARACTER TABULATION (tab) character</dt>
     <dt>U+000C FORM FEED (FF) character</dt>
     <dt>U+0020 SPACE character</dt>
     <dd>
      <p>Set |tokenizer state| to the <a>WebVTT start tag annotation state</a>, and jump to the step
      labeled <i>next</i>.</p>
     </dd>

     <dt>U+000A LINE FEED (LF) character</dt>
     <dd>
      <p>Set |buffer| to |c|, set |tokenizer state| to the <a>WebVTT start tag annotation state</a>,
      and jump to the step labeled <i>next</i>.</p>
     </dd>

     <dt>U+002E FULL STOP character (.)</dt>
     <dd>
      <p>Set |tokenizer state| to the <a>WebVTT start tag class state</a>, and jump to the step
      labeled <i>next</i>.</p>
     </dd>

     <dt>U+003E GREATER-THAN SIGN character (>)</dt>
     <dd>
      <p>Advance |position| to the next character in |input|, then jump to the next "end-of-file
      marker" entry below.</p>
     </dd>

     <dt>End-of-file marker</dt>
     <dd>
      <p>Return a start tag whose tag name is |result|, with no classes and no annotation, and abort
      these steps.</p>
     </dd>

     <dt>Anything else</dt>
     <dd>
      <p>Append |c| to |result| and jump to the step labeled <i>next</i>.</p>
     </dd>

    </dl>

   </dd>

   <dt><dfn>WebVTT start tag class state</dfn></dt>

   <dd>

    <p>Jump to the entry that matches the value of |c|:</p>

    <dl>

     <dt>U+0009 CHARACTER TABULATION (tab) character</dt>
     <dt>U+000C FORM FEED (FF) character</dt>
     <dt>U+0020 SPACE character</dt>
     <dd>
      <p>Append to |classes| an entry whose value is |buffer|, set |buffer| to the empty string, set
      |tokenizer state| to the <a>WebVTT start tag annotation state</a>, and jump to the step
      labeled <i>next</i>.</p>
     </dd>

     <dt>U+000A LINE FEED (LF) character</dt>
     <dd>
      <p>Append to |classes| an entry whose value is |buffer|, set |buffer| to |c|, set |tokenizer
      state| to the <a>WebVTT start tag annotation state</a>, and jump to the step labeled
      <i>next</i>.</p>
     </dd>

     <dt>U+002E FULL STOP character (.)</dt>
     <dd>
      <p>Append to |classes| an entry whose value is |buffer|, set |buffer| to the empty string, and
      jump to the step labeled <i>next</i>.</p>
     </dd>

     <dt>U+003E GREATER-THAN SIGN character (>)</dt>
     <dd>
      <p>Advance |position| to the next character in |input|, then jump to the next "end-of-file
      marker" entry below.</p>
     </dd>

     <dt>End-of-file marker</dt>
     <dd>
      <p>Append to |classes| an entry whose value is |buffer|, then return a start tag whose tag
      name is |result|, with the classes given in |classes| but no annotation, and abort these
      steps.</p>
     </dd>

     <dt>Anything else</dt>
     <dd>
      <p>Append |c| to |buffer| and jump to the step labeled <i>next</i>.</p>
     </dd>

    </dl>

   </dd>

   <dt><dfn>WebVTT start tag annotation state</dfn></dt>

   <dd>

    <p>Jump to the entry that matches the value of |c|:</p>

    <dl>

     <dt>U+0026 AMPERSAND (&amp;)</dt>
     <dd>
      <p>Set |tokenizer state| to the <a>HTML character reference in annotation state</a>, and jump
      to the step labeled <i>next</i>.</p>
     </dd>

     <dt>U+003E GREATER-THAN SIGN character (>)</dt>
     <dd>
      <p>Advance |position| to the next character in |input|, then jump to the next "end-of-file
      marker" entry below.</p>
     </dd>

     <dt>End-of-file marker</dt>
     <dd>
      <p>Remove any leading or trailing <a>ASCII whitespace</a> characters from |buffer|, and
      replace any sequence of one or more consecutive <a>ASCII whitespace</a> characters in |buffer|
      with a single U+0020 SPACE character; then, return a start tag whose tag name is |result|,
      with the classes given in |classes|, and with |buffer| as the annotation, and abort these
      steps.</p>
     </dd>

     <dt>Anything else</dt>
     <dd>
      <p>Append |c| to |buffer| and jump to the step labeled <i>next</i>.</p>
     </dd>

    </dl>

   </dd>

   <dt><dfn>HTML character reference in annotation state</dfn></dt>

   <dd>

    <p>Attempt to <a>consume an HTML character reference</a>, with the <a>additional allowed
    character</a> being U+003E GREATER-THAN SIGN (>).</p>

    <p>If nothing is returned, append a U+0026 AMPERSAND character (&amp;) to |buffer|.</p>

    <p>Otherwise, append the data of the character tokens that were returned to |buffer|.</p>

    <p>Then, in any case, set |tokenizer state| to the <a>WebVTT start tag annotation state</a>, and
    jump to the step labeled <i>next</i>.</p>

   </dd>

   <dt><dfn>WebVTT end tag state</dfn></dt>

   <dd>

    <p>Jump to the entry that matches the value of |c|:</p>

    <dl>

     <!-- should we ignore anything after spaces, tabs, and line feeds? -->

     <dt>U+003E GREATER-THAN SIGN character (>)</dt>
     <dd>
      <p>Advance |position| to the next character in |input|, then jump to the next "end-of-file
      marker" entry below.</p>
     </dd>

     <dt>End-of-file marker</dt>
     <dd>
      <p>Return an end tag whose tag name is |result| and abort these steps.</p>
     </dd>

     <dt>Anything else</dt>
     <dd>
      <p>Append |c| to |result| and jump to the step labeled <i>next</i>.</p>
     </dd>

    </dl>

   </dd>

   <dt><dfn>WebVTT timestamp tag state</dfn></dt>

   <dd>

    <p>Jump to the entry that matches the value of |c|:</p>

    <dl>

     <dt>U+003E GREATER-THAN SIGN character (>)</dt>
     <dd>
      <p>Advance |position| to the next character in |input|, then jump to the next "end-of-file
      marker" entry below.</p>
     </dd>

     <dt>End-of-file marker</dt>
     <dd>
      <p>Return a timestamp tag whose tag value is |result| and abort these steps.</p>
     </dd>

     <dt>Anything else</dt>
     <dd>
      <p>Append |c| to |result| and jump to the step labeled <i>next</i>.</p>
     </dd>

    </dl>

   </dd>

  </dl>

 </li>

 <li><p><i>Next</i>: Advance |position| to the next character in |input|.</p></li>

 <li><p>Jump to the step labeled <i>loop</i>.</p></li>

</ol>
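
<p class="note">The tokenizer states above can be sketched as a small state machine. This is a non-normative illustration under stated assumptions: <code>next_token</code> and <code>_char_ref</code> are hypothetical names, tokens are returned as tuples, and character references are only approximated with Python's <code>html.unescape</code> applied to references that end in a semicolon, which is simpler than the HTML algorithm for consuming a character reference.</p>

```python
import html
import re

# Approximation: only references that end in ";" are considered.
_CHAR_REF = re.compile(r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")
_ASCII_WHITESPACE = "\t\n\f\r "


def _char_ref(text, position):
    """Return (decoded, length) for the reference at position, or ("&", 1)."""
    match = _CHAR_REF.match(text, position)
    if match and html.unescape(match.group()) != match.group():
        return html.unescape(match.group()), match.end() - position
    return "&", 1


def next_token(text, position=0):
    """Return (token, new_position) for the token starting at position."""
    state, result, buffer, classes = "data", "", "", []
    while True:
        c = text[position] if position < len(text) else None  # None: end of file
        if c == "&" and state in ("data", "annotation"):
            decoded, length = _char_ref(text, position)
            if state == "data":
                result += decoded
            else:
                buffer += decoded
            position += length
            continue
        if state == "data":
            if c == "<" and not result:
                state = "tag"
            elif c is None or c == "<":
                return ("string", result), position
            else:
                result += c
        elif c is None or c == ">":
            # Every tag state ends its token at ">" or at the end of input.
            if c == ">":
                position += 1
            if state in ("end", "timestamp"):
                return (state, result), position
            if state == "class":
                classes.append(buffer)
            annotation = None
            if state == "annotation":
                collapsed = re.sub("[" + _ASCII_WHITESPACE + "]+", " ", buffer)
                annotation = collapsed.strip(" ")
            return ("start", result, classes, annotation), position
        elif state == "tag":
            if c in "\t\n\f ":
                state = "annotation"
            elif c == ".":
                state = "class"
            elif c == "/":
                state = "end"
            elif c in "0123456789":
                result, state = c, "timestamp"
            else:
                result, state = c, "start"
        elif state == "start":
            if c in "\t\n\f ":
                buffer = c if c == "\n" else ""
                state = "annotation"
            elif c == ".":
                state = "class"
            else:
                result += c
        elif state == "class":
            if c in "\t\n\f ":
                classes.append(buffer)
                buffer = c if c == "\n" else ""
                state = "annotation"
            elif c == ".":
                classes.append(buffer)
                buffer = ""
            else:
                buffer += c
        elif state == "annotation":
            buffer += c
        else:  # the "end" and "timestamp" states just accumulate characters
            result += c
        position += 1
```

<p class="note">Calling the sketch repeatedly, each time starting from the returned position, yields the sequence of tokens for a cue text string. A string token is returned as soon as a U+003C LESS-THAN SIGN is seen, with the position left pointing at that character.</p>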

<p>When the algorithm above says to attempt to <dfn>consume an HTML character reference</dfn>, it
means to attempt to <a>consume a character reference</a> as defined in HTML. [[!HTML]]</p>

<p>When the HTML specification says to consume a character, in this context, it means to advance
|position| to the next character in |input|. When it says to unconsume a character, it means to move
|position| back to the previous character in |input|. "EOF" is equivalent to the end-of-file marker
in this specification. Finally, this context is <em>not</em> "as part of an attribute" (when it
comes to handling a missing semicolon).</p>


<h3 id=dom-construction-rules><dfn>WebVTT cue text DOM construction rules</dfn></h3>

<p class="note">To retrieve a <a>WebVTT cue</a>'s content via the
{{VTTCue/getCueAsHTML()}} method of the {{VTTCue}} interface, the cue's text needs to be converted to a
{{DocumentFragment}}. This section describes how.</p>

<p>To convert a <a>list of WebVTT Node Objects</a> to a DOM tree for {{Document}} |owner|, user
agents must create a tree of DOM nodes that is isomorphous to the tree of <a lt="WebVTT Node
Object">WebVTT Node Objects</a>, with the following mapping of <a lt="WebVTT Node Object">WebVTT
Node Objects</a> to DOM nodes:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a>WebVTT Node Object</a></th>
   <th>DOM node</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td class=long><a>List of WebVTT Node Objects</a></td>
   <td class=long>{{DocumentFragment}} node.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Region Object</a></td>
   <td class=long>{{DocumentFragment}} node.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Class Object</a></td>
   <td class=long>HTML <a element>span</a> element.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Italic Object</a></td>
   <td class=long>HTML <a element>i</a> element.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Bold Object</a></td>
   <td class=long>HTML <a element>b</a> element.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Underline Object</a></td>
   <td class=long>HTML <a element>u</a> element.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Ruby Object</a></td>
   <td class=long>HTML <a element>ruby</a> element.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Ruby Text Object</a></td>
   <td class=long>HTML <a element>rt</a> element.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Voice Object</a></td>
   <td class=long>HTML <a element>span</a> element with a <a element-attr>title</a> attribute set to
   the <a>WebVTT Voice Object</a>'s value.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Language Object</a></td>
   <td class=long>HTML <a element>span</a> element with a <a element-attr>lang</a> attribute set to
   the <a>WebVTT Language Object</a>'s <a lt="WebVTT Node Object's applicable language">applicable
   language</a>.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Text Object</a></td>
   <td class=long>{{Text}} node whose {{CharacterData/data}} is the value of the <a>WebVTT Text
   Object</a>.</td>
  </tr>
  <tr>
   <td class=long><a>WebVTT Timestamp Object</a></td>
   <td class=long>{{ProcessingInstruction}} node whose {{ProcessingInstruction/target}} is
   "<code>timestamp</code>" and whose {{CharacterData/data}} is a <a>WebVTT timestamp</a>
   representing the value of the <a>WebVTT Timestamp Object</a>, with all optional components
   included, with one leading zero if the |hours| component is less than ten, and with no leading
   zeros otherwise.</td>
  </tr>
 </tbody>
</table>

<p>HTML elements created as part of the mapping described above must have their
{{Element/namespaceURI}} set to the <a>HTML namespace</a>, use the appropriate IDL interface as defined
in the HTML specification, and, if the corresponding <a>WebVTT Internal Node Object</a> has any <a
lt="WebVTT Node Object's applicable classes">applicable classes</a>, must have a <a
element-attr>class</a> attribute set to the string obtained by concatenating all those classes, each
separated from the next by a single U+0020 SPACE character.</p>

<p>The {{Node/ownerDocument}} attribute of all nodes in the DOM tree must be set to the given
document |owner|.</p>

<p>All characteristics of the DOM nodes that are not described above or dependent on characteristics
defined above must be left at their initial values.</p>
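
<div class=example>
<p>As a non-normative illustration of the mapping above, the following TypeScript sketch serializes
a tree of node objects to HTML-like markup instead of building real DOM nodes. The
<code>VttNodeSketch</code> type, its <code>kind</code> names, and all function names are
hypothetical stand-ins and are not defined by this specification.</p>

```typescript
// Hypothetical, simplified stand-ins for WebVTT Node Objects.
type VttNodeSketch =
  | { kind: "text"; value: string }
  | { kind: "timestamp"; seconds: number }
  | {
      kind: "class" | "italic" | "bold" | "underline" | "ruby" | "rubyText" | "voice" | "language";
      classes: string[];
      value?: string; // voice name for "voice", applicable language for "language"
      children: VttNodeSketch[];
    };

// Element names taken from the mapping table.
const elementNames = {
  class: "span", italic: "i", bold: "b", underline: "u",
  ruby: "ruby", rubyText: "rt", voice: "span", language: "span",
};

function padNumber(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Timestamp data: every component present, hours padded to at least two digits.
function timestampData(totalSeconds: number): string {
  const ms = Math.round(totalSeconds * 1000);
  const h = Math.floor(ms / 3600000);
  const m = Math.floor(ms / 60000) % 60;
  const s = Math.floor(ms / 1000) % 60;
  return padNumber(h, 2) + ":" + padNumber(m, 2) + ":" + padNumber(s, 2) + "." + padNumber(ms % 1000, 3);
}

function escapeMarkup(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");
}

function serializeSketch(nodes: VttNodeSketch[]): string {
  return nodes.map((node) => {
    if (node.kind === "text") return escapeMarkup(node.value);
    if (node.kind === "timestamp") return "<?timestamp " + timestampData(node.seconds) + "?>";
    const name = elementNames[node.kind];
    let attrs = "";
    if (node.kind === "voice") attrs += ' title="' + escapeMarkup(node.value || "") + '"';
    if (node.kind === "language") attrs += ' lang="' + escapeMarkup(node.value || "") + '"';
    // Applicable classes are joined with single U+0020 SPACE characters.
    if (node.classes.length > 0) attrs += ' class="' + escapeMarkup(node.classes.join(" ")) + '"';
    return "<" + name + attrs + ">" + serializeSketch(node.children) + "</" + name + ">";
  }).join("");
}
```

<p>A real implementation would create nodes in the HTML namespace owned by the given document; the
string output here only makes the shape of the resulting tree easy to inspect.</p>
</div>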


<h3 id=rules-for-extracting-the-chapter-title algorithm>WebVTT rules for extracting the chapter
title</h3>

<p>The <dfn>WebVTT rules for extracting the chapter title</dfn> for a <a>WebVTT cue</a> |cue| are as
follows:</p>

<ol algorithm="WebVTT rules for extracting the chapter title">

 <li><p>Let |nodes| be the <a>list of WebVTT Node Objects</a> obtained by applying the <a>WebVTT cue
 text parsing rules</a> to the |cue|'s <a>cue text</a>.</p></li>

 <li><p>Return the concatenation of the values of each <a>WebVTT Text Object</a> in |nodes|, in a
 pre-order, depth-first traversal, excluding <a lt="WebVTT Ruby Text Object">WebVTT Ruby Text
 Objects</a> and their descendants.</p></li>

</ol>
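
<div class=example>
<p>The traversal in the steps above can be sketched as follows. This is a non-normative
illustration; the <code>ChapterNodeSketch</code> type and the function name are hypothetical, and
the input is assumed to be an already parsed list of node objects.</p>

```typescript
// Hypothetical, simplified stand-ins for WebVTT Node Objects.
type ChapterNodeSketch =
  | { kind: "text"; value: string }
  | { kind: "timestamp" }
  | { kind: "internal"; isRubyText: boolean; children: ChapterNodeSketch[] };

// Concatenate text values in pre-order, skipping ruby text objects and their descendants.
function extractChapterTitle(nodes: ChapterNodeSketch[]): string {
  return nodes.map((node) => {
    if (node.kind === "text") return node.value;
    if (node.kind === "internal" && !node.isRubyText) return extractChapterTitle(node.children);
    return ""; // timestamps and ruby text contribute nothing
  }).join("");
}
```
</div>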


<h2 id=rendering>Rendering</h2>

<p class="note">This section describes in some detail how to visually render <a>WebVTT caption or
subtitle cues</a> in a user agent. The processing model is quite tightly linked to media elements in
HTML, where CSS is available. <a>User agents that do not support CSS</a> are expected to render
plain text only, without styling and positioning features. <a>User agents that do not support a full
HTML CSS engine</a> are expected to render an equivalent visual representation to what a user agent
with a full CSS engine would render.</p>


<h3 id=processing-model algorithm>Processing model</h3>

<p>The <dfn>rules for updating the display of WebVTT text tracks</dfn> render the <a>text tracks</a>
of a <a>media element</a> (specifically, a <a element>video</a> element), or of another playback
mechanism, by applying the steps below. All the <a lt="text track">text tracks</a> that use these
rules for a given <a>media element</a>, or other playback mechanism, are rendered together, to avoid
overlapping subtitles from multiple tracks. A fallback language |language| may be set when calling
this algorithm.</p>

<p class="note">In HTML, audio elements have no visual rendering area, so this algorithm aborts for
them. Authors who create WebVTT captions or subtitles for audio resources need to publish them in a
video element for the user agent to render them.</p>

<p>The output of the steps below is a set of CSS boxes that covers the rendering area of the
<a>media element</a> or other playback mechanism, which user agents are expected to render in a
manner suiting the user.</p>

<p>The rules are as follows:</p>

<ol algorithm="rules for updating the display of WebVTT text tracks">

 <li><p>If the <a>media element</a> is an <a element>audio</a> element, or is another playback
 mechanism with no rendering area, abort these steps.</p></li>

 <li><p>Let |video| be the <a>media element</a> or other playback mechanism.</p></li>

 <li><p>Let |output| be an empty list of absolutely positioned CSS block boxes.</p></li>

 <li><p>If the user agent is <a lt="expose a user interface to the user">exposing a user
 interface</a> for |video|, add to |output| one or more completely transparent positioned CSS block
 boxes that cover the same region as the user interface.</p></li>

 <li><p>If the last time these rules were run, the user agent was not <a lt="expose a user interface
 to the user">exposing a user interface</a> for |video|, but now it is, optionally let |reset| be
 true. Otherwise, let |reset| be false.</p></li>

 <li><p>Let |tracks| be the subset of |video|'s <a>list of text tracks</a> that have as their
 <a>rules for updating the text track rendering</a> these <a>rules for updating the display of
 WebVTT text tracks</a>, and whose <a>text track mode</a> is <a lt="text track
 showing">showing</a>.</p></li>

 <li><p>Let |cues| be an empty list of <a lt="text track cue">text track cues</a>.</p></li>

 <li><p>For each track |track| in |tracks|, append to |cues| all the <a lt="text track cue">cues</a>
 from |track|'s <a lt="text track list of cues">list of cues</a> that have their <a>text track cue
 active flag</a> set.</p></li>

 <li><p>Let |regions| be an empty list of <a lt="WebVTT region">WebVTT regions</a>.</p></li>

 <li><p>For each track |track| in |tracks|, append to |regions| all the <a lt="WebVTT
 region">regions</a> with an identifier from |track|'s <a lt="text track list of regions">list of
 regions</a>.</p></li>

 <li><p>If |reset| is false, then, for each <a>WebVTT region</a> |region| in |regions| let
 |regionNode| be a <a>WebVTT region object</a>.</p></li>

 <li>
  <p>Apply the following steps for each |regionNode|:</p>

  <ol>
   <li>
    <p>Prepare some variables for the application of CSS properties to |regionNode| as follows:</p>

    <ul>
     <li><p>Let |regionWidth| be the <a>WebVTT region width</a>. Let |width| be
     ''|regionWidth|&#x2009;vw'' (''vw'' is a CSS unit). [[!CSS-VALUES]]</p></li>

     <li><p>Let |lineHeight| be ''6vh'' (''vh'' is a CSS unit) [[!CSS-VALUES]] and |regionHeight| be
     the <a>WebVTT region lines</a>. Let |lines| be |lineHeight| multiplied by
     |regionHeight|.</p></li>

     <li><p>Let |viewportAnchorX| be the x dimension of the <a>WebVTT region viewport anchor</a> and
     |regionAnchorX| be the x dimension of the <a>WebVTT region anchor</a>. Let |leftOffset| be
     |regionAnchorX| multiplied by |width| divided by 100.0. Let |left| be |leftOffset| subtracted
     from ''|viewportAnchorX|&#x2009;vw''.</p></li>

     <li><p>Let |viewportAnchorY| be the y dimension of the <a>WebVTT region viewport anchor</a> and
     |regionAnchorY| be the y dimension of the <a>WebVTT region anchor</a>. Let |topOffset| be
     |regionAnchorY| multiplied by |lines| divided by 100.0. Let |top| be |topOffset| subtracted
     from ''|viewportAnchorY|&#x2009;vh''.</p></li>
    </ul>
   </li>

   <li>
    <p>Apply the terms of the CSS specifications to |regionNode| within the following constraints,
    thus obtaining a CSS box |box| positioned relative to an initial containing block:</p>
    <ol>
     <li><p>No style sheets are associated with |regionNode|. (The regionNodes are subsequently
     restyled using style sheets after their boxes are generated, as described below.)</p></li>
     <li><p>Properties on |regionNode| have their values set as defined in the next section. (That
     section uses some of the variables whose values were calculated earlier in this
     algorithm.)</p></li>
     <li><p>The video viewport (and initial containing block) is |video|'s rendering area.</p></li>
    </ol>
   </li>

   <li><p>Add the CSS box |box| to |output|.</p></li>
  </ol>
 </li>

 <li>
  <p>If |reset| is false, then, for each <a>WebVTT cue</a> |cue| in |cues|: if |cue|'s <a>text track
  cue display state</a> has a set of CSS boxes, then:</p>

  <ul>
   <li><p>If |cue|'s <a>WebVTT cue region</a> is not null, add those boxes to that region's |box|
   and remove |cue| from |cues|.</p></li>
   <li><p>Otherwise, add those boxes to |output| and remove |cue| from |cues|.</p></li>
  </ul>

 </li>

 <li>

  <p>For each <a>WebVTT cue</a> |cue| in |cues| that has not yet had corresponding CSS boxes added
  to |output|, in <a>text track cue order</a>, run the following substeps:</p>

  <ol>

   <li><p>Let |nodes| be the <a>list of WebVTT Node Objects</a> obtained by applying the <a>WebVTT
   cue text parsing rules</a>, with the fallback language |language| if provided, to the |cue|'s
   <a>cue text</a>.</p></li>

   <li>
    <p>If |cue|'s <a>WebVTT cue region</a> is null, run the following substeps:</p>

    <ol>

     <li><a>Apply WebVTT cue settings</a> to obtain CSS boxes |boxes| from |nodes|.</li>

     <li><p>Let |cue|'s <a>text track cue display state</a> have the CSS boxes in |boxes|.</p></li>

     <li><p>Add the CSS boxes in |boxes| to |output|.</p></li>

    </ol>
   </li>

   <li>
    <p>Otherwise, run the following substeps:</p>
    <ol>
     <li><p>Let |region| be |cue|'s <a>WebVTT cue region</a>.</p></li>

     <li><p>If |region|'s <a>WebVTT region scroll</a> setting is <a lt="WebVTT region scroll
     up">up</a> and |region| already has one child, set |region|'s 'transition-property' to
     ''transition-property/top'' and 'transition-duration' to ''0.433s''.</p></li>

     <li><p>Let |offset| be |cue|'s <a lt="cue computed position">computed position</a> multiplied
     by |region|'s <a>WebVTT region width</a> and divided by 100 (i.e. interpret it as a percentage
     of the region width).</p></li>

     <li>
      <p>Adjust |offset| using |cue|'s <a lt="cue computed position alignment">computed position
      alignment</a> as follows:</p>
      <dl class="switch">
       <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
       lt="WebVTT cue position center alignment">center alignment</a></dt>
       <dd><p>Subtract half of |region|'s <a>WebVTT region width</a> from |offset|.</p></dd>

       <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
       lt="WebVTT cue position line-right alignment">line-right alignment</a></dt>
       <dd><p>Subtract |region|'s <a>WebVTT region width</a> from |offset|.</p></dd>
      </dl>
     </li>

     <li><p>Let |left| be ''|offset|&#x2009;%''. [[!CSS-VALUES]]</p></li>

     <li><p><a>Obtain a set of CSS boxes</a> |boxes| positioned relative to an initial containing
     block.</p></li>

     <li><p>If there are no line boxes in |boxes|, skip the remainder of these substeps for |cue|.
     The cue is ignored.</p></li>

     <li><p>Let |cue|'s <a>text track cue display state</a> have the CSS boxes in |boxes|.</p></li>

     <li><p>Add the CSS boxes in |boxes| to |region|.</p></li>

     <li><p>If the CSS boxes |boxes| together have a height less than the height of the |region|
     box, let |diff| be the absolute difference between the two height values. Increase |top| by
     |diff| and re-apply it to |regionNode|.</p></li>
    </ol>
   </li>

  </ol>
 </li>

 <li><p>Return |output|.</p></li>

</ol>
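
<div class=example>
<p>The variables prepared for each region node in the steps above can be sketched as follows. This
is a non-normative illustration: the <code>RegionSketch</code> interface and the function name are
hypothetical, and the viewport anchor values are assumed to come from the region's viewport anchor
while the region anchor values come from the region anchor.</p>

```typescript
// Hypothetical container for the region settings used by the sketch (all anchors in percent).
interface RegionSketch {
  width: number; // WebVTT region width, in percent
  lines: number; // WebVTT region lines
  regionAnchorX: number;
  regionAnchorY: number;
  viewportAnchorX: number;
  viewportAnchorY: number;
}

function regionBoxGeometry(r: RegionSketch) {
  const lineHeightVh = 6; // each line is 6vh tall
  const widthVw = r.width;
  const linesVh = lineHeightVh * r.lines;
  // The region anchor is a point inside the region, as a percentage of its own size.
  const leftOffset = (r.regionAnchorX * widthVw) / 100;
  const topOffset = (r.regionAnchorY * linesVh) / 100;
  return {
    width: widthVw + "vw",
    lines: linesVh + "vh",
    // Place the region so that its anchor point lands on the viewport anchor.
    left: (r.viewportAnchorX - leftOffset) + "vw",
    top: (r.viewportAnchorY - topOffset) + "vh",
  };
}
```

<p>For instance, a full-width region of three lines whose region anchor and viewport anchor are both
(0, 100) yields a box 18vh tall whose top edge is at 82vh, so its bottom edge sits at the bottom of
the video viewport.</p>
</div>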

<p>User agents may allow the user to override the above algorithm's positioning of cues, e.g. by
dragging them to another location on the <a element>video</a>, or even off the <a element>video</a>
entirely.</p>
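
<div class=example>
<p>For a cue that belongs to a region, the processing model above derives the cue's left offset from
its computed position and the region width. A minimal, non-normative sketch of that arithmetic
follows; the type and function names are hypothetical.</p>

```typescript
type RegionCueAlignSketch = "line-left" | "center" | "line-right";

// Returns the value used for "left", as a CSS percentage string.
function cueLeftInRegion(
  computedPosition: number,
  regionWidth: number,
  alignment: RegionCueAlignSketch
): string {
  // Interpret the computed position as a percentage of the region width.
  let offset = (computedPosition * regionWidth) / 100;
  if (alignment === "center") offset -= regionWidth / 2;
  else if (alignment === "line-right") offset -= regionWidth;
  return offset + "%";
}
```
</div>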


<h3 id=processing-cue-settings algorithm>Processing cue settings</h3>

<p>When the processing algorithm above requires that the user agent <dfn>apply WebVTT cue
settings</dfn> to obtain CSS boxes from a <a>list of WebVTT Node Objects</a> |nodes|, the user agent
must run the following algorithm.</p>

<ol algorithm="apply WebVTT cue settings">

 <li><p>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue horizontal writing
 direction">horizontal</a>, then let |writing-mode| be "horizontal-tb". Otherwise, if the <a>WebVTT
 cue writing direction</a> is <a lt="WebVTT cue vertical growing left writing direction">vertical
 growing left</a>, then let |writing-mode| be "vertical-rl". Otherwise, the <a>WebVTT cue writing
 direction</a> is <a lt="WebVTT cue vertical growing right writing direction">vertical growing
 right</a>; let |writing-mode| be "vertical-lr".</p></li>

 <li>

  <p>Determine the value of |maximum size| for |cue| as per the appropriate rules from the following
  list:</p>

  <dl class="switch">

   <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
   lt="WebVTT cue position line-left alignment">line-left</a></dt>
   <dd>
    <p>Let |maximum size| be the <a lt="cue computed position">computed position</a> subtracted from
    100.</p>
   </dd>

   <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
   lt="WebVTT cue position line-right alignment">line-right</a></dt>
   <dd>
    <p>Let |maximum size| be the <a lt="cue computed position">computed position</a>.</p>
   </dd>

   <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
   lt="WebVTT cue position center alignment">center</a>, and the <a lt="cue computed
   position">computed position</a> is less than or equal to 50</dt>
   <dd>
    <p>Let |maximum size| be the <a lt="cue computed position">computed position</a> multiplied by
    two.</p>
   </dd>

   <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
   lt="WebVTT cue position center alignment">center</a>, and the <a lt="cue computed
   position">computed position</a> is greater than <!-- or equal to --> 50</dt>
   <dd>
    <p>Let |maximum size| be the result of subtracting <a lt="cue computed position">computed
    position</a> from 100 and then multiplying the result by two.</p>
   </dd>

  </dl>

 </li>

 <li><p>If the <a>WebVTT cue size</a> is less than |maximum size|, then let |size| be <a>WebVTT cue
 size</a>. Otherwise, let |size| be |maximum size|.</p></li>

 <li><p>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue horizontal writing
 direction">horizontal</a>, then let |width| be ''|size|&#x2009;vw'' and |height| be
 ''height/auto''. Otherwise, let |width| be ''width/auto'' and |height| be ''|size|&#x2009;vh''.
 (These are CSS values used by the next section to set CSS properties for the rendering; ''vw'' and
 ''vh'' are CSS units.) [[!CSS-VALUES]]</p></li>

 <li>

  <p>Determine the value of |x-position| or |y-position| for |cue| as per the appropriate rules from
  the following list:</p>

  <dl class="switch">

   <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue horizontal writing
   direction">horizontal</a></dt>
   <dd>
    <dl class="switch">
     <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
     lt="WebVTT cue position line-left alignment">line-left alignment</a></dt>
     <dd><p>Let |x-position| be the <a lt="cue computed position">computed position</a>.</p></dd>

     <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
     lt="WebVTT cue position center alignment">center alignment</a></dt>
     <dd><p>Let |x-position| be the <a lt="cue computed position">computed position</a> minus half
     of |size|.</p></dd>

     <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
     lt="WebVTT cue position line-right alignment">line-right alignment</a></dt>
     <dd><p>Let |x-position| be the <a lt="cue computed position">computed position</a> minus
     |size|.</p></dd>
    </dl>
   </dd>

   <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue vertical growing left writing
   direction">vertical growing left</a> or <a lt="WebVTT cue vertical growing right writing
   direction">vertical growing right</a></dt>
   <dd>
    <dl class="switch">
     <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
     lt="WebVTT cue position line-left alignment">line-left alignment</a></dt>
     <dd><p>Let |y-position| be the <a lt="cue computed position">computed position</a>.</p></dd>

     <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
     lt="WebVTT cue position center alignment">center alignment</a></dt>
     <dd><p>Let |y-position| be the <a lt="cue computed position">computed position</a> minus half
     of |size|.</p></dd>

     <dt>If the <a lt="cue computed position alignment">computed position alignment</a> is <a
     lt="WebVTT cue position line-right alignment">line-right alignment</a></dt>
     <dd><p>Let |y-position| be the <a lt="cue computed position">computed position</a> minus
     |size|.</p></dd>
    </dl>
   </dd>

  </dl>

 </li>

 <li>

  <p>Determine the value of whichever of |x-position| or |y-position| is not yet calculated for
  |cue| as per the appropriate rules from the following list:</p>

  <dl class="switch">

   <dt>If the <a>WebVTT cue snap-to-lines flag</a> is false</dt>
   <dd>
    <dl class="switch">

     <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue horizontal writing
     direction">horizontal</a></dt>
     <dd><p>Let |y-position| be the <a lt="cue computed line">computed line</a>.</p></dd>

     <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue vertical growing left
     writing direction">vertical growing left</a> or <a lt="WebVTT cue vertical growing right
     writing direction">vertical growing right</a></dt>
     <dd><p>Let |x-position| be the <a lt="cue computed line">computed line</a>.</p></dd>

    </dl>
   </dd>

   <dt>If the <a>WebVTT cue snap-to-lines flag</a> is true</dt>
   <dd>
    <dl class="switch">

     <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue horizontal writing
     direction">horizontal</a></dt>
     <dd><p>Let |y-position| be 0.</p></dd>

     <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue vertical growing left
     writing direction">vertical growing left</a> or <a lt="WebVTT cue vertical growing right
     writing direction">vertical growing right</a></dt>
     <dd><p>Let |x-position| be 0.</p></dd>

    </dl>
   </dd>

  </dl>

  <p class="note">These are not final positions; they are merely temporary positions used to
  calculate box dimensions below.</p>

 </li>

 <li><p>Let |left| be ''|x-position|&#x2009;vw'' and |top| be ''|y-position|&#x2009;vh''. (These are
 CSS values used by the next section to set CSS properties for the rendering; ''vw'' and ''vh'' are
 CSS units.) [[!CSS-VALUES]]</p></li>

 <li><p><a>Obtain a set of CSS boxes</a> |boxes| positioned relative to an initial containing
 block.</p></li>

 <li><p>If there are no line boxes in |boxes|, skip the remainder of these substeps for |cue|. The
 cue is ignored.</p></li>

 <li>

  <p>Adjust the positions of |boxes| according to the appropriate steps from the following list:</p>

  <dl class="switch">

   <dt>If |cue|'s <a>WebVTT cue snap-to-lines flag</a> is true</dt>

   <dd>

    <p>Many of the steps in this algorithm vary according to the <a>WebVTT cue writing
    direction</a>. Steps labeled "<strong>Horizontal</strong>" must be followed only when the
    <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue horizontal writing
    direction">horizontal</a>, steps labeled "<strong>Vertical</strong>" must be followed when the
    <a>WebVTT cue writing direction</a> is either <a lt="WebVTT cue vertical growing left writing
    direction">vertical growing left</a> or <a lt="WebVTT cue vertical growing right writing
    direction">vertical growing right</a>, steps labeled "<strong>Vertical Growing Left</strong>"
    must be followed only when the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue vertical
    growing left writing direction">vertical growing left</a>, and steps labeled "<strong>Vertical
    Growing Right</strong>" must be followed only when the <a>WebVTT cue writing direction</a> is <a
    lt="WebVTT cue vertical growing right writing direction">vertical growing right</a>.</p>

    <ol>

     <li>

      <p><strong>Horizontal</strong>: Let |full dimension| be the height of |video|'s rendering
      area.</p>

      <p><strong>Vertical</strong>: Let |full dimension| be the width of |video|'s rendering
      area.</p>

     </li>

     <li>

      <p><strong>Horizontal</strong>: Let |step| be the height of the first line box in |boxes|.</p>

      <p><strong>Vertical</strong>: Let |step| be the width of the first line box in |boxes|.</p>

     </li>

     <li><p>If |step| is zero, then jump to the step labeled <i>done positioning</i> below.</p></li>

     <li><p>Let |line| be |cue|'s <a lt="cue computed line">computed line</a>.</p></li>

     <li><p>Round |line| to an integer by adding 0.5 and then flooring it.</p></li>

     <li><p><strong>Vertical Growing Left</strong>: Add one to |line| then negate it.</p></li>

     <li><p>Let |position| be the result of multiplying |step| and |line|.</p></li>

     <li><p><strong>Vertical Growing Left</strong>: Decrease |position| by the width of the bounding
     box of the boxes in |boxes|, then increase |position| by |step|.</p></li>

     <li>

      <p>If |line| is less than zero then increase |position| by |full dimension|, and negate
      |step|.</p>

     </li>

     <li>

      <p><strong>Horizontal</strong>: Move all the boxes in |boxes| down by the distance given by
      |position|.</p>

      <p><strong>Vertical</strong>: Move all the boxes in |boxes| right by the distance given by
      |position|.</p>

     </li>

     <li><p>Remember the position of all the boxes in |boxes| as their |specified
     position|.</p></li>

     <li><p>Let |title area| be a box that covers all of the |video|'s rendering area.</p></li>

     <li><p><i>Step loop</i>: If none of the boxes in |boxes| would overlap any of the boxes in
     |output|, and all of the boxes in |boxes| are entirely within the |title area| box, then jump
     to the step labeled <i>done positioning</i> below.</p></li>

     <li>

      <p><strong>Horizontal</strong>: If |step| is negative and the top of the first line box in
      |boxes| is now above the top of the |title area|, or if |step| is positive and the bottom of
      the first line box in |boxes| is now below the bottom of the |title area|, jump to the step
      labeled <i>switch direction</i>.</p>

      <p><strong>Vertical</strong>: If |step| is negative and the left edge of the first line box in
      |boxes| is now to the left of the left edge of the |title area|, or if |step| is positive and
      the right edge of the first line box in |boxes| is now to the right of the right edge of the
      |title area|, jump to the step labeled <i>switch direction</i>.</p>

     </li>

     <li>

      <p><strong>Horizontal</strong>: Move all the boxes in |boxes| down by the distance given by
      |step|. (If |step| is negative, then this will actually result in an upwards movement of the
      boxes in absolute terms.)</p>

      <p><strong>Vertical</strong>: Move all the boxes in |boxes| right by the distance given by
      |step|. (If |step| is negative, then this will actually result in a leftwards movement of the
      boxes in absolute terms.)</p>

     </li>

     <li><p>Jump back to the step labeled <i>step loop</i>.</p></li>

     <li><p><i>Switch direction</i>: If |switched| is true, then remove all the boxes in |boxes|,
     and jump to the step labeled <i>done positioning</i> below.</p></li>

     <li><p>Otherwise, move all the boxes in |boxes| back to their |specified position| as
     determined in the earlier step.</p></li>

     <li><p>Negate |step|.</p></li>

     <li><p>Set |switched| to true.</p></li>

     <li><p>Jump back to the step labeled <i>step loop</i>.</p></li>

    </ol>

   </dd>

   <dt>If |cue|'s <a>WebVTT cue snap-to-lines flag</a> is false</dt>
   <dd>

    <ol>

     <li><p>Let |bounding box| be the bounding box of the boxes in |boxes|.</p></li>

     <li>
      <p>Run the appropriate steps from the following list:</p>

      <dl class="switch">
       <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue horizontal writing
       direction">horizontal</a></dt>
       <dd>
        <dl class="switch">
         <dt>If the <a>WebVTT cue line alignment</a> is <a lt="WebVTT cue line center
         alignment">center alignment</a></dt>
         <dd><p>Move all the boxes in |boxes| up by half of the height of |bounding box|.</p></dd>

         <dt>If the <a>WebVTT cue line alignment</a> is <a lt="WebVTT cue line end alignment">end
         alignment</a></dt>
         <dd><p>Move all the boxes in |boxes| up by the height of |bounding box|.</p></dd>
        </dl>
       </dd>

       <dt>If the <a>WebVTT cue writing direction</a> is <a lt="WebVTT cue vertical growing left
       writing direction">vertical growing left</a> or <a lt="WebVTT cue vertical growing right
       writing direction">vertical growing right</a></dt>
       <dd>
        <dl class="switch">
         <dt>If the <a>WebVTT cue line alignment</a> is <a lt="WebVTT cue line center
         alignment">center alignment</a></dt>
         <dd><p>Move all the boxes in |boxes| left by half of the width of |bounding box|.</p></dd>

         <dt>If the <a>WebVTT cue line alignment</a> is <a lt="WebVTT cue line end alignment">end
         alignment</a></dt>
         <dd><p>Move all the boxes in |boxes| left by the width of |bounding box|.</p></dd>
        </dl>
       </dd>
      </dl>
     </li>

     <li><p>If none of the boxes in |boxes| would overlap any of the boxes in |output|, and all the
     boxes in |boxes| are within the |video|'s rendering area, then jump to the step labeled <i>done
     positioning</i> below.</p></li>

     <li><p>If there is a position to which the boxes in |boxes| can be moved while maintaining the
     relative positions of the boxes in |boxes| to each other such that none of the boxes in |boxes|
     would overlap any of the boxes in |output|, and all the boxes in |boxes| would be within the
     |video|'s rendering area, then move the boxes in |boxes| to the closest such position to their
     current position, and then jump to the step labeled <i>done positioning</i> below. If there are
     multiple such positions that are equidistant from their current position, use the highest one
     amongst them; if there are several at that height, then use the leftmost one amongst
     them.</p></li>

     <li><p>Otherwise, jump to the step labeled <i>done positioning</i> below. (The boxes will
     unfortunately overlap.)</p></li>

    </ol>

   </dd>

  </dl>

 </li>

 <li><p><i>Done positioning</i>: Return |boxes|.</p></li>

</ol>
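
<div class=example>
<p>The arithmetic in the algorithm above can be sketched as follows. This is a non-normative
illustration under stated assumptions: all type and function names are hypothetical, the first
function covers only steps 1 to 7 (writing mode, size, and temporary position), and the second
covers only the snap-to-lines substeps that turn the computed line into an initial offset. It
assumes a nonzero step and treats the dimension added for negative lines as the full dimension of
the rendering area.</p>

```typescript
type WritingDirectionSketch = "horizontal" | "vertical-growing-left" | "vertical-growing-right";
type PositionAlignmentSketch = "line-left" | "center" | "line-right";

interface CueSettingsSketch {
  direction: WritingDirectionSketch;
  size: number; // WebVTT cue size, in percent
  computedPosition: number; // in percent
  computedPositionAlignment: PositionAlignmentSketch;
  computedLine: number;
  snapToLines: boolean;
}

// Steps 1 to 7: writing mode, size, and temporary position of the cue box.
function cueBoxGeometry(cue: CueSettingsSketch) {
  const pos = cue.computedPosition;
  const align = cue.computedPositionAlignment;

  let maximumSize: number;
  if (align === "line-left") maximumSize = 100 - pos;
  else if (align === "line-right") maximumSize = pos;
  else maximumSize = pos <= 50 ? pos * 2 : (100 - pos) * 2;
  const size = cue.size < maximumSize ? cue.size : maximumSize;

  // Offset along the line: x for horizontal cues, y for vertical cues.
  let along = pos;
  if (align === "center") along = pos - size / 2;
  else if (align === "line-right") along = pos - size;

  // Offset across the lines; 0 is a temporary value when snap-to-lines is set.
  const across = cue.snapToLines ? 0 : cue.computedLine;

  const horizontal = cue.direction === "horizontal";
  const writingMode = horizontal
    ? "horizontal-tb"
    : cue.direction === "vertical-growing-left" ? "vertical-rl" : "vertical-lr";
  return {
    writingMode: writingMode,
    width: horizontal ? size + "vw" : "auto",
    height: horizontal ? "auto" : size + "vh",
    left: (horizontal ? along : across) + "vw",
    top: (horizontal ? across : along) + "vh",
  };
}

// Snap-to-lines substeps 4 to 9: initial offset and direction of the search step.
function snapToLinesStart(
  computedLine: number,
  step: number,
  fullDimension: number,
  direction: WritingDirectionSketch,
  boundingBoxWidth: number
) {
  const growingLeft = direction === "vertical-growing-left";
  let line = Math.floor(computedLine + 0.5); // round to an integer
  if (growingLeft) line = -(line + 1);
  let position = step * line;
  if (growingLeft) position = position - boundingBoxWidth + step;
  if (line < 0) {
    // Negative lines count from the far edge and search back towards the start.
    position += fullDimension;
    step = -step;
  }
  return { position: position, step: step };
}
```

<p>With a horizontal cue on computed line -1, a line box height of 20 and a rendering area height of
360, the sketch gives an initial offset of 340 and a step of -20: the cue starts on the last line
and moves upwards if it collides with other boxes.</p>
</div>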


<h3 id=obtaining-css-boxes algorithm>Obtaining CSS boxes</h3>

<p>When the processing algorithm above requires that the user agent <dfn>obtain a set of CSS
boxes</dfn> |boxes|, then apply the terms of the CSS specifications to |nodes| within the following
constraints: [[!CSS22]]</p>

<ul>

 <li><p>The <i>document tree</i> is the tree of <a lt="WebVTT Node Object">WebVTT Node Objects</a>
 rooted at |nodes|.</p></li>

 <li>
  <p>For the purpose of selectors in STYLE blocks of a WebVTT file, the style sheet must apply to a
  hypothetical document that contains a single empty element with no explicit name, no namespace, no
  attributes, no classes, no IDs, and unknown primary language, that acts like the <a>media
  element</a> for the <a>text tracks</a> that were sourced from the given WebVTT file. The selectors
  must not match other <a>text tracks</a> for the same <a>media element</a>. In this hypothetical
  document, the element must not match any selector that would match the element itself.</p>

  <p class=note>This element exists only to be the <a spec=selectors>originating element</a> for the
  ''::cue'', ''::cue()'', ''::cue-region'' and ''::cue-region()'' pseudo-elements.</p>
 </li>

 <li>
  <p>For the purpose of determining the <a spec=css-cascade>cascade</a> of the declarations in STYLE
  blocks of a WebVTT file, the relative order of appearance of the style sheets must be the same
  order as they were added to the collection, and the order of appearance of the collection must be
  after any style sheets that apply to the associated <a element>video</a> element's document.</p>

  <div class=example>
   <p>For example, given the following (invalid) HTML document:</p>

   <pre>
   &lt;!doctype html>
   &lt;title>Invalid cascade example&lt;/title>
   &lt;video controls autoplay src="video.webm">
    &lt;track default src="track.vtt">
   &lt;/video>
   &lt;style>
    ::cue { color:red }
   &lt;/style>
   </pre>

   <p>...and the "track.vtt" file contains:</p>

   <pre>
   WEBVTT

   STYLE
   ::cue { color:lime }

   00:00:00.000 --> 00:00:25.000
   Red or green?
   </pre>

   <p>The ''color:lime'' declaration would win, because it is last in the <a
   spec=css-cascade>cascade</a>, even though the <a element>style</a> element is after the <a
   element>video</a> element in the document order.</p>
  </div>
 </li>

 <li>

  <p id=style-no-external-resources>For the purpose of resolving URLs in STYLE blocks of a WebVTT
  file, or any URLs in resources referenced from STYLE blocks of a WebVTT file, if the URL's scheme
  is not "<code>data</code>", then the user agent must act as if the URL failed to resolve.</p>

  <p><strong class=advisement>Supporting external resources with ''@import'' or 'background-image'
  would be a new ability for <a>media elements</a> and <a element>track</a> elements to issue
  network requests as the user watches the video, which could be a privacy issue.</strong></p>

 </li>

 <li><p>For the purposes of processing by the CSS specification, <a lt="WebVTT Internal Node
 Object">WebVTT Internal Node Objects</a> are equivalent to elements with the same
 contents.</p></li>

 <li>For the purposes of processing by the CSS specification, <a lt="WebVTT Text Object">WebVTT Text
 Objects</a> are equivalent to {{Text}} nodes.</li>

 <li>No style sheets are associated with |nodes|. (The nodes are subsequently restyled using style
 sheets after their boxes are generated, as described below.)</li>

 <li>The children of |nodes| must be wrapped in an anonymous box whose 'display' property has
 the value ''display/inline''. This is the <dfn>WebVTT cue background box</dfn>.</li>

 <li>Runs of children of <a lt="WebVTT Ruby Object">WebVTT Ruby Objects</a> that are not <a
 lt="WebVTT Ruby Text Object">WebVTT Ruby Text Objects</a> must be wrapped in anonymous boxes whose
 'display' property has the value ''display/ruby-base''. [[!CSS3-RUBY]]</li>

 <li>Properties on <a lt="WebVTT Node Object">WebVTT Node Objects</a> have their values set as
 defined in the next section. (That section uses some of the variables whose values were calculated
 earlier in this algorithm.)</li>

 <li>Text runs must be wrapped according to the CSS line-wrapping rules.</li>

 <li>The video viewport (and initial containing block) is |video|'s rendering area.</li>

</ul>

<p>Let |boxes| be the boxes generated as descendants of the initial containing block, along with
their positions.</p>


<h3 id=applying-css-properties algorithm>Applying CSS properties to <a lt="WebVTT Node
Object">WebVTT Node Objects</a></h3>

<p>When following the <a>rules for updating the display of WebVTT text tracks</a>, user agents must
set properties of <a lt="WebVTT Node Object">WebVTT Node Objects</a> at the CSS user agent cascade
layer as defined in this section. [[!CSS22]]</p>

<p>Initialize the (root) <a>list of WebVTT Node Objects</a> with the following CSS settings:</p>

<ul>
 <li>the 'position' property must be set to ''position/absolute''</li>
 <li>the 'unicode-bidi' property must be set to ''unicode-bidi/plaintext''</li>
 <li>the 'writing-mode' property must be set to |writing-mode|</li>
 <li>the 'top' property must be set to |top|</li>
 <li>the 'left' property must be set to |left|</li>
 <li>the 'width' property must be set to |width|</li>
 <li>the 'height' property must be set to |height|</li>
 <li>the 'overflow-wrap' property must be set to ''overflow-wrap/break-word''</li>
 <li>the 'text-wrap' property must be set to ''text-wrap/balance'' [[!CSS-TEXT-4]]</li>
</ul>

<p>The variables |writing-mode|, |top|, |left|, |width|, and |height| are the values with those
names determined by the <a>rules for updating the display of WebVTT text tracks</a> for the
<a>WebVTT cue</a> from whose <a lt="cue text">text</a> the <a>list of WebVTT Node Objects</a> was
constructed.</p>

<p>The 'text-align' property on the (root) <a>list of WebVTT Node Objects</a> must be set to the
value in the second cell of the row of the table below whose first cell is the value of the
corresponding <a lt="text track cue">cue</a>'s <a>WebVTT cue text alignment</a>:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a>WebVTT cue text alignment</a></th>
   <th>'text-align' value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><a lt="WebVTT cue start alignment">Start alignment</a></td>
   <td>''text-align/start''</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue center alignment">Center alignment</a></td>
   <td>''text-align/center''</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue end alignment">End alignment</a></td>
   <td>''text-align/end''</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue left alignment">Left alignment</a></td>
   <td>''text-align/left''</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue right alignment">Right alignment</a></td>
   <td>''text-align/right''</td>
  </tr>
 </tbody>
</table>

<p>The 'font' shorthand property on the (root) <a>list of WebVTT Node Objects</a> must be set to
''5vh sans-serif''. [[!CSS-VALUES]]</p>

<p>The 'color' property on the (root) <a>list of WebVTT Node Objects</a> must be set to
''rgba(255,255,255,1)''. [[!CSS3-COLOR]]</p>

<p>The 'background' shorthand property on the <a>WebVTT cue background box</a> and on <a>WebVTT Ruby
Text Objects</a> must be set to ''rgba(0,0,0,0.8)''. [[!CSS3-COLOR]]</p>

<p>The 'white-space' property on the (root) <a>list of WebVTT Node Objects</a> must be set to
''white-space/pre-line''. [[!CSS22]]</p>
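
<div class="example">
 <p>As a non-normative illustration, the settings listed so far for the root <a>list of WebVTT Node
 Objects</a> can be collected into one declaration map. This is only a sketch: the names
 <code>RootCueGeometry</code>, <code>DeclarationMap</code> and <code>rootCueDeclarations</code> are
 hypothetical, and the geometry values are assumed to arrive as ready-made CSS value strings from
 the rendering rules.</p>

```typescript
// Hypothetical model; none of these names are defined by this specification.
type CueTextAlignment = "start" | "center" | "end" | "left" | "right";

// Values assumed to be computed earlier by the rendering rules,
// already serialized as CSS value strings.
interface RootCueGeometry {
  writingMode: string;
  top: string;
  left: string;
  width: string;
  height: string;
}

type DeclarationMap = { [property: string]: string };

// Builds the user-agent-level declarations for the root list of
// WebVTT Node Objects of a cue that is not part of a region.
function rootCueDeclarations(
  geometry: RootCueGeometry,
  alignment: CueTextAlignment
): DeclarationMap {
  return {
    "position": "absolute",
    "unicode-bidi": "plaintext",
    "writing-mode": geometry.writingMode,
    "top": geometry.top,
    "left": geometry.left,
    "width": geometry.width,
    "height": geometry.height,
    "overflow-wrap": "break-word",
    "text-wrap": "balance",
    // Each cue text alignment maps to the 'text-align' keyword of the same name.
    "text-align": alignment,
    "font": "5vh sans-serif",
    "color": "rgba(255,255,255,1)",
    "white-space": "pre-line",
  };
}
```

 <p>The 'background' shorthand is deliberately absent from the map, because it is set on the
 <a>WebVTT cue background box</a> rather than on the root list.</p>
</div>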

<p>The 'font-style' property on <a lt="WebVTT Italic Object">WebVTT Italic Objects</a> must be set
to ''font-style/italic''.</p>

<p>The 'font-weight' property on <a lt="WebVTT Bold Object">WebVTT Bold Objects</a> must be set to
''font-weight/bold''.</p>

<p>The 'text-decoration' property on <a lt="WebVTT Underline Object">WebVTT Underline Objects</a>
must be set to ''text-decoration/underline''.</p>

<p>The 'display' property on <a lt="WebVTT Ruby Object">WebVTT Ruby Objects</a> must be set to
''display/ruby''. [[!CSS3-RUBY]]</p>

<p>The 'display' property on <a lt="WebVTT Ruby Text Object">WebVTT Ruby Text Objects</a> must be
set to ''display/ruby-text''. [[!CSS3-RUBY]]</p>

<p>Every <a>WebVTT region object</a> is initialized with the following CSS settings:</p>

<ul>
 <li>the 'position' property must be set to ''position/absolute''</li>
 <li>the 'writing-mode' property must be set to ''writing-mode/horizontal-tb''</li>
 <li>the 'background' shorthand property must be set to ''rgba(0,0,0,0.8)''</li>
 <li>the 'overflow-wrap' property must be set to ''overflow-wrap/break-word''</li>
 <li>the 'font' shorthand property must be set to ''5vh sans-serif''</li>
 <li>the 'color' property must be set to ''rgba(255,255,255,1)''</li>
 <li>the 'overflow' property must be set to ''overflow/hidden''</li>
 <li>the 'width' property must be set to |width|</li>
 <li>the 'min-height' property must be set to ''0px''</li>
 <li>the 'max-height' property must be set to |height|</li>
 <li>the 'left' property must be set to |left|</li>
 <li>the 'top' property must be set to |top|</li>
 <li>the 'display' property must be set to ''display/inline-flex''</li>
 <li>the 'flex-flow' property must be set to ''flex-flow/column''</li>
 <li>the 'justify-content' property must be set to ''justify-content/flex-end''</li>
</ul>

<p>The variables |width|, |height|, |top|, and |left| are the values with those names determined by
the <a>rules for updating the display of WebVTT text tracks</a> for the <a>WebVTT region</a> from
which the <a>WebVTT region object</a> was constructed.</p>

<p>The children of every <a>WebVTT region object</a> are further initialized with these CSS
settings:</p>

<ul>
 <li>the 'position' property must be set to ''position/relative''</li>
 <li>the 'unicode-bidi' property must be set to ''unicode-bidi/plaintext''</li>
 <li>the 'width' property must be set to ''width/auto''</li>
 <li>the 'height' property must be set to |height|</li>
 <li>the 'left' property must be set to |left|</li>
 <li>the 'text-align' property must be set as described above for a root <a>list of WebVTT Node
 Objects</a> that is not part of a region</li>
</ul>

<p>All other non-inherited properties must be set to their initial values; inherited properties on
the root <a>list of WebVTT Node Objects</a> must inherit their values from the <a>media element</a>
for which the <a>WebVTT cue</a> is being rendered, if any. If there is no <a>media element</a> (i.e.
if the <a>text track</a> is being rendered for another media playback mechanism), then inherited
properties on the root <a>list of WebVTT Node Objects</a> and the <a lt="WebVTT region
object">WebVTT region objects</a> must take their initial values.</p>

<p>If there are style sheets that apply to the <a>media element</a> or other playback mechanism,
then they must be interpreted as defined in the next section.</p>


<h2 id=css-extensions>CSS extensions</h2>

<p class="note">This section specifies some CSS pseudo-elements and pseudo-classes and how they
apply to WebVTT. This section does not apply to <a>user agents that do not support CSS</a>.</p>


<h3 id=css-extensions-introduction>Introduction</h3>

<p><i>This section is non-normative.</i></p>

<p>The ''::cue'' pseudo-element represents a cue.</p>

<p>The ''::cue(|selector|)'' pseudo-element represents a cue, or an element inside a cue, that
matches the given selector.</p>

<p>The ''::cue-region'' pseudo-element represents a region.</p>

<p>The ''::cue-region(|selector|)'' pseudo-element represents a region, or an element inside a
region, that matches the given selector.</p>

<p class="note">Similarly to all other pseudo-elements, these pseudo-elements are not directly
present in the <{video}> element's document tree.</p>

<p>The '':past'' and '':future'' pseudo-classes can be used in ''::cue(|selector|)'' to match
<a>WebVTT Internal Node Objects</a> based on the <a>current playback position</a>.</p>

<div class="example" id="example-cue-selector">
 <p>The following table shows examples of what can be selected with a given selector, together with
 WebVTT syntax to produce the relevant objects.</p>

 <table class="data">
  <thead>
   <tr>
    <th>Selector (CSS syntax example)</th>
    <th>Matches (WebVTT syntax example)</th>
   </tr>
  </thead>
  <tbody>
   <tr>
    <td class=long>
     <p>''::cue''</p>

     <pre>
     video::cue {
       color: yellow;
     }</pre>
    </td>
    <td class=long>
     <p>Any <a>list of WebVTT Node Objects</a>.</p>

     <pre>
     WEBVTT

     00:00:00.000 --> 00:00:08.000
     Yellow!

     00:00:08.000 --> 00:00:16.000
     Also yellow!
     </pre>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p><a>ID selector</a> in ''::cue()''</p>

     <pre>
     video::cue(#cue1) {
       color: yellow;
     }</pre>
    </td>
    <td class=long>
     <p>Any <a>list of WebVTT Node Objects</a> with the cue's <a>text track cue identifier</a>
     matching the given ID.</p>

     <pre>
     WEBVTT

     cue1
     00:00:00.000 --> 00:00:08.000
     Yellow!
     </pre>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p><a>Type selector</a> in ''::cue()''</p>

     <pre>
     video::cue(c),
     video::cue(i),
     video::cue(b),
     video::cue(u),
     video::cue(ruby),
     video::cue(rt),
     video::cue(v),
     video::cue(lang) {
       color: yellow;
     }
     </pre>
    </td>
    <td class=long>
     <p><a>WebVTT Internal Node Objects</a> (except the root <a>list of WebVTT Node Objects</a>)
     with the given name.</p>

     <pre>
     WEBVTT

     00:00:00.000 --> 00:00:08.000
     &lt;c>Yellow!&lt;/c>
     &lt;i>Yellow!&lt;/i>
     &lt;b>Yellow!&lt;/b>
     &lt;u>Yellow!&lt;/u>
     &lt;ruby>Yellow! &lt;rt>Yellow!&lt;/rt>&lt;/ruby>
     &lt;v Kathryn>Yellow!&lt;/v>
     &lt;lang en>Yellow!&lt;/lang>
     </pre>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p><a>Class selector</a> in ''::cue()''</p>

     <pre>
     video::cue(.loud) {
       color: yellow;
     }</pre>
    </td>
    <td class=long>
     <p><a>WebVTT Internal Node Objects</a> (except the root <a>list of WebVTT Node Objects</a>)
     with the given <a lt="WebVTT Node Object's applicable classes">applicable classes</a>.</p>

     <pre>
     WEBVTT

     00:00:00.000 --> 00:00:08.000
     &lt;c.loud>Yellow!&lt;/c>
     &lt;i.loud>Yellow!&lt;/i>
     &lt;b.loud>Yellow!&lt;/b>
     &lt;u.loud>Yellow!&lt;/u>
     &lt;ruby.loud>Yellow! &lt;rt.loud>Yellow!&lt;/rt>&lt;/ruby>
     &lt;v.loud Kathryn>Yellow!&lt;/v>
     &lt;lang.loud en>Yellow!&lt;/lang>
     </pre>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p><a>Attribute selector</a> in ''::cue()''</p>

     <pre>
     video::cue([lang="en-US"]) {
       color: yellow;
     }
     video::cue(lang[lang="en-GB"]) {
       color: cyan;
     }
     video::cue(v[voice="Kathryn"]) {
       color: lime;
     }
     </pre>
    </td>
    <td class=long>
     <p>For "lang", the root <a>list of WebVTT Node Objects</a> or <a>WebVTT Language Object</a>
     with the given <a lt="WebVTT Node Object's applicable language">applicable language</a>; for
     "voice", the <a>WebVTT Voice Object</a> with the given voice.</p>

     <pre>
     WEBVTT

     00:00:00.000 --> 00:00:08.000
     Yellow!

     00:00:08.000 --> 00:00:16.000
     &lt;lang en-GB>Cyan!&lt;/lang>

     00:00:16.000 --> 00:00:24.000
     &lt;v Kathryn>I like lime.&lt;/v>
     </pre>

     <p>The <a lt="WebVTT Node Object's applicable language">applicable language</a> for the <a>list
     of WebVTT Node Objects</a> can be set by the <{track/srclang}> attribute in HTML.</p>

     <pre>
     &lt;video ...>
      &lt;track src="example-attr.vtt"
             srclang="en-US" default>
     &lt;/video>
     </pre>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p>'':lang()'' pseudo-class in ''::cue()''</p>
     <pre>
     video::cue(:lang(en)) {
       color: yellow;
     }
     video::cue(:lang(en-GB)) {
       color: cyan;
     }
     </pre>
    </td>
    <td class=long>
     <p><a>WebVTT Internal Node Objects</a> with an <a lt="WebVTT Node Object's applicable
     language">applicable language</a> matching the given language range.</p>
     <pre>
     WEBVTT

     00:00:00.000 --> 00:00:08.000
     Yellow!

     00:00:08.000 --> 00:00:16.000
     &lt;lang en-GB>Cyan!&lt;/lang>
     </pre>

     <p>As above, the <a lt="WebVTT Node Object's applicable language">applicable language</a> for
     the <a>list of WebVTT Node Objects</a> can be set by the <{track/srclang}> attribute in
     HTML.</p>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p>'':past'' and '':future'' pseudo-classes in ''::cue()''</p>

     <pre>
     video::cue(:past) {
       color: yellow;
     }
     video::cue(:future) {
       color: cyan;
     }
     </pre>
    </td>
    <td class=long>
     <p><a>WebVTT Internal Node Objects</a> in cues that have <a>WebVTT Timestamp Objects</a>,
     matched depending on the <a>current playback position</a>.</p>

     <pre>
     WEBVTT

     00:00:00.000 --> 00:00:08.000
     &lt;c>No match (no timestamps)&lt;/c>

     00:00:08.000 --> 00:00:16.000
     No match &lt;00:00:12.000> (no elements)

     00:00:16.000 --> 00:00:24.000
     &lt;00:00:16.000> &lt;c>This&lt;/c>
     &lt;00:00:18.000> &lt;c>can&lt;/c>
     &lt;00:00:20.000> &lt;c>match&lt;/c>
     &lt;00:00:22.000> &lt;c>:past/:future&lt;/c>
     &lt;00:00:24.000>
     </pre>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p>''::cue-region''</p>

     <pre>
     video::cue-region {
       color: yellow;
     }</pre>
    </td>
    <td class=long>
     <p>Any region (list of <a>WebVTT region objects</a>).</p>

     <pre>
     WEBVTT

     REGION
     id:editor-comments
     regionanchor:0%,0%
     viewportanchor:0%,0%

     00:00:00.000 --> 00:00:08.000
     No match (normal cue)

     00:00:08.000 --> 00:00:16.000 region:editor-comments
     Yellow!
     </pre>
    </td>
   </tr>
   <tr>
    <td class=long>
     <p><a>ID selector</a> in ''::cue-region()''</p>

     <pre>
     video::cue-region(#scroll) {
       color: cyan;
     }</pre>
    </td>
    <td class=long>
     <p>Any region (list of <a>WebVTT region objects</a>) with a <a>WebVTT region identifier</a>
     matching the given ID.</p>

     <pre>
     WEBVTT

     REGION
     id:editor-comments
     width:40%
     regionanchor:0%,100%
     viewportanchor:10%,90%

     REGION
     id:scroll
     width:40%
     regionanchor:100%,100%
     viewportanchor:90%,90%
     scroll:up

     00:00:00.000 --> 00:00:08.000
     No match (normal cue)

     00:00:08.000 --> 00:00:16.000 region:editor-comments
     Yellow!

     00:00:10.000 --> 00:00:16.000 region:scroll
     Over here it's Cyan!
     </pre>
    </td>
   </tr>
  </tbody>
 </table>
</div>


<h3 id="css-extensions-processing-model">Processing model</h3>

<p>When a user agent is rendering one or more <a lt="WebVTT cue">WebVTT cues</a> according to the
<a>rules for updating the display of WebVTT text tracks</a>, <a lt="WebVTT Node Object">WebVTT Node
Objects</a> in the <a>list of WebVTT Node Objects</a> used in the rendering can be matched by
certain pseudo-selectors as defined below. These selectors can begin or stop matching individual <a
lt="WebVTT Node Object">WebVTT Node Objects</a> while a <a lt="text track cue">cue</a> is being
rendered, even in between applications of the <a>rules for updating the display of WebVTT text
tracks</a> (which are only run when the set of active cues changes). User agents that support the
pseudo-elements described below must dynamically update renderings accordingly. When either
'white-space' or one of the properties corresponding to the 'font' shorthand (including
'line-height') changes value, then the <a>WebVTT cue</a>'s <a>text track cue display state</a> must
be emptied and the <a>text track</a>'s <a>rules for updating the text track rendering</a> must be
immediately rerun.</p>
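
<div class="example">
 <p>The distinction drawn above, between style changes that only need a dynamic restyle and those
 that force the rendering rules to be rerun, could be sketched as a simple lookup. The function
 name <code>requiresRerender</code> is hypothetical, and the list of 'font' longhands is an
 assumption taken from common CSS usage; this specification does not enumerate them.</p>

```typescript
// Assumed longhands of the 'font' shorthand (including 'line-height'),
// plus 'white-space'. This list is illustrative, not exhaustive.
const LAYOUT_AFFECTING_PROPERTIES: string[] = [
  "white-space",
  "font-style",
  "font-variant",
  "font-weight",
  "font-stretch",
  "font-size",
  "line-height",
  "font-family",
];

// Returns true when at least one changed property can alter box dimensions,
// in which case the cue display state would be emptied and the rendering
// rules rerun; otherwise a dynamic restyle is enough.
function requiresRerender(changedProperties: string[]): boolean {
  for (const property of changedProperties) {
    if (LAYOUT_AFFECTING_PROPERTIES.indexOf(property) !== -1) {
      return true;
    }
  }
  return false;
}
```

 <p>Under these assumptions, a change to 'color' alone would not trigger a rerun, while a change to
 'font-size' would.</p>
</div>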

<p>Pseudo-elements apply to elements that are matched by selectors. For the purpose of this section,
that element is the <i>matched element</i>. The pseudo-elements defined in the following sections
affect the styling of parts of <a lt="WebVTT cue">WebVTT cues</a> that are being rendered for the
<i>matched element</i>.</p>

<p class="note">If the <i>matched element</i> is not a <a element>video</a> element, the
pseudo-elements defined below won't have any effect according to this specification.</p>

<p>A CSS user agent that implements the <a>text tracks</a> model must implement the ''::cue'',
''::cue(|selector|)'', ''::cue-region'' and ''::cue-region(|selector|)'' pseudo-elements, and the
'':past'' and '':future'' pseudo-classes.</p>


<h4 id=the-cue-pseudo-element>The ''::cue'' pseudo-element</h4>

<p>The <dfn selector>::cue</dfn> pseudo-element (with no argument) matches any <a>list of WebVTT Node
Objects</a> constructed for the <i>matched element</i>, with the exception that the properties
corresponding to the 'background' shorthand must be applied to the <a>WebVTT cue background box</a>
rather than the <a>list of WebVTT Node Objects</a>.</p>

<p>The following properties apply to the ''::cue'' pseudo-element with no argument; other properties
set on the pseudo-element must be ignored:</p>

<ul class="brief">
 <li>'color'</li>
 <li>'opacity'</li>
 <li>'visibility'</li>
 <li>the properties corresponding to the 'text-decoration' shorthand</li>
 <li>'text-shadow'</li>
 <li>the properties corresponding to the 'background' shorthand</li>
 <li>the properties corresponding to the 'outline' shorthand</li>
 <li>the properties corresponding to the 'font' shorthand, including 'line-height'</li>
 <li>'white-space'</li>
 <li>'text-combine-upright'</li>
 <li>'ruby-position'</li>
 <!-- add more... -->
 <!-- definitely not: display, float, position, top, left, right, bottom, width, height, margin-top,
 margin-bottom, margin-left, margin-right, clip, clear, content, cursor, direction, max-height,
 min-height, max-width, min-width, orphans, overflow, page-break-*, text-align, unicode-bidi,
 widows, z-index -->
</ul>

<p>The <dfn selector>::cue(|selector|)</dfn> pseudo-element with an argument must have an argument that
consists of a CSS selector [[!SELECTORS4]]. It matches any <a>WebVTT Internal Node Object</a>
constructed for the <i>matched element</i> that also matches the given CSS selector, with the nodes
being treated as follows:</p>

<ul>

 <li><p>The <i>document tree</i> against which the selectors are matched is the tree of <a
 lt="WebVTT Node Object">WebVTT Node Objects</a> rooted at the <a>list of WebVTT Node Objects</a>
 for the cue.</p></li>

 <li><p><a lt="WebVTT Internal Node Object">WebVTT Internal Node Objects</a> are elements in the
 tree.</p></li>

 <li><a lt="WebVTT Leaf Node Object">WebVTT Leaf Node Objects</a> cannot be matched.</li>

 <li>

  <p>For the purposes of element type selectors, the names of <a lt="WebVTT Internal Node
  Object">WebVTT Internal Node Objects</a> are as given by the following table, where objects having
  the concrete class given in a cell in the first column have the name given by the second column of
  the same row:</p>

  <table class="complex data">
   <thead>
    <tr>
     <th>Concrete class</th>
     <th>Name</th>
    </tr>
   </thead>
   <tbody>
    <tr>
     <td><a lt="WebVTT Class Object">WebVTT Class Objects</a></td>
     <td><code>c</code></td>
    </tr>
    <tr>
     <td><a lt="WebVTT Italic Object">WebVTT Italic Objects</a></td>
     <td><code>i</code></td>
    </tr>
    <tr>
     <td><a lt="WebVTT Bold Object">WebVTT Bold Objects</a></td>
     <td><code>b</code></td>
    </tr>
    <tr>
     <td><a lt="WebVTT Underline Object">WebVTT Underline Objects</a></td>
     <td><code>u</code></td>
    </tr>
    <tr>
     <td><a lt="WebVTT Ruby Object">WebVTT Ruby Objects</a></td>
     <td><code>ruby</code></td>
    </tr>
    <tr>
     <td><a lt="WebVTT Ruby Text Object">WebVTT Ruby Text Objects</a></td>
     <td><code>rt</code></td>
    </tr>
    <tr>
     <td><a lt="WebVTT Voice Object">WebVTT Voice Objects</a></td>
     <td><code>v</code></td>
    </tr>
    <tr>
     <td><a lt="WebVTT Language Object">WebVTT Language Objects</a></td>
     <td><code>lang</code></td>
    </tr>
    <tr>
     <td>Other elements (specifically, <a lt="list of WebVTT Node Objects">lists of WebVTT Node
     Objects</a>)</td>
     <td>No explicit name.</td>
    </tr>
   </tbody>
  </table>

 </li>

 <li><p>For the purposes of element type and universal selectors, <a lt="WebVTT Internal Node
 Object">WebVTT Internal Node Objects</a> are considered as being in the namespace expressed as the
 empty string.</p></li>

 <li><p>For the purposes of attribute selector matching, <a lt="WebVTT Internal Node Object">WebVTT
 Internal Node Objects</a> have no attributes, except for <a lt="WebVTT Voice Object">WebVTT Voice
 Objects</a>, which have a single attribute named "<code>voice</code>" whose value is the value of
 the <a>WebVTT Voice Object</a>, <a lt="WebVTT Language Object">WebVTT Language Objects</a>, which
 have a single attribute named "<code>lang</code>" whose value is the object's <a lt="WebVTT Node
 Object's applicable language">applicable language</a>, and <a lt="list of WebVTT Node
 Objects">lists of WebVTT Node Objects</a> that have a non-empty <a lt="WebVTT Node Object's
 applicable language">applicable language</a>, which have a single attribute named
 "<code>lang</code>" whose value is the object's <a lt="WebVTT Node Object's applicable
 language">applicable language</a>.</p></li>

 <li><p>For the purposes of class selector matching, <a lt="WebVTT Internal Node Object">WebVTT
 Internal Node Objects</a> have the classes described as the <a>WebVTT Node Object's applicable
 classes</a>.</p></li> <!-- ok, this isn't especially well-defined, but the Selectors spec doesn't
 really give one much to go on here. -->

 <li><p>For the purposes of the '':lang()'' pseudo-class, <a lt="WebVTT Internal Node Object">WebVTT
 Internal Node Objects</a> have the language described as the <a>WebVTT Node Object's applicable
 language</a>.</p></li>

 <li><p>For the purposes of ID selector matching, <a lt="list of WebVTT Node Objects">lists of
 WebVTT Node Objects</a> have the ID given by the cue's <a>text track cue identifier</a>, if
 any.</p></li>

</ul>
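
<div class="example">
 <p>The treatment of nodes listed above can be pictured as a small, non-normative projection from a
 <a>WebVTT Internal Node Object</a> to the facts a selector engine would see. The types
 <code>InternalNodeModel</code> and <code>SelectorView</code> and the function
 <code>selectorViewOf</code> are hypothetical names introduced only for this sketch.</p>

```typescript
// Hypothetical model of a WebVTT Internal Node Object ("root" stands for
// the list of WebVTT Node Objects itself).
type InternalNodeKind = "root" | "c" | "i" | "b" | "u" | "ruby" | "rt" | "v" | "lang";

interface InternalNodeModel {
  kind: InternalNodeKind;
  applicableClasses: string[];
  applicableLanguage: string; // empty string when there is none
  voice?: string;             // only meaningful for "v"
  cueIdentifier?: string;     // only meaningful for "root"
}

// What selector matching observes for one node.
interface SelectorView {
  typeName: string | null; // null: no explicit name (the root list)
  namespaceUri: string;    // always the empty string
  id: string | null;
  classes: string[];
  attributes: { [attribute: string]: string };
}

function selectorViewOf(node: InternalNodeModel): SelectorView {
  const attributes: { [attribute: string]: string } = {};
  if (node.kind === "v") {
    // Voice objects expose a single "voice" attribute.
    attributes["voice"] = node.voice === undefined ? "" : node.voice;
  } else if (node.kind === "lang") {
    // Language objects expose a single "lang" attribute.
    attributes["lang"] = node.applicableLanguage;
  } else if (node.kind === "root" && node.applicableLanguage !== "") {
    // The root list exposes "lang" only when its language is non-empty.
    attributes["lang"] = node.applicableLanguage;
  }
  const identifier = node.kind === "root" ? node.cueIdentifier : undefined;
  return {
    typeName: node.kind === "root" ? null : node.kind,
    namespaceUri: "",
    id: identifier === undefined || identifier === "" ? null : identifier,
    classes: node.applicableClasses.slice(),
    attributes: attributes,
  };
}
```

 <p>In this model a voice object carries no <code>lang</code> attribute even when it has an
 applicable language; its language remains reachable through the '':lang()'' pseudo-class.</p>
</div>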

<p>The following properties apply to the ''::cue()'' pseudo-element with an argument:</p>

<ul class="brief">
 <li>'color'</li>
 <li>'opacity'</li>
 <li>'visibility'</li>
 <li>the properties corresponding to the 'text-decoration' shorthand</li>
 <li>'text-shadow'</li>
 <li>the properties corresponding to the 'background' shorthand</li>
 <li>the properties corresponding to the 'outline' shorthand</li>
 <li>properties relating to the transition and animation features</li>
 <!-- add more... -->
 <!-- but definitely not anything that affects dimensions of boxes, e.g. the 'font' shorthand's
 properties or 'white-space'; those are listed below instead -->
</ul>

<!--v2 Would be nice to support transitions that are directional, e.g. changing text fill colour or
shadow size of the start of a segment when the segment becomes "past", and having the change
propagate towards the end of the segment so that it reaches the end of the segment when the next
segment becomes "past". -->

<p>In addition, the following properties apply to the ''::cue()'' pseudo-element with an argument
when the selector does not contain the '':past'' and '':future'' pseudo-classes:</p>

<ul class="brief">
 <li>the properties corresponding to the 'font' shorthand, including 'line-height'</li>
 <li>'white-space'</li>
 <li>'text-combine-upright'</li>
 <li>'ruby-position'</li>
 <!-- add more... -->
 <!-- definitely not: display, float, position, top, left, right, bottom, width, height, margin-top,
 margin-bottom, margin-left, margin-right, clip, clear, content, cursor, direction, max-height,
 min-height, max-width, min-width, orphans, overflow, page-break-*, text-align, unicode-bidi,
 widows, z-index -->
</ul>

<p>Properties that do not apply must be ignored.</p>

<p>As a special exception, the properties corresponding to the 'background' shorthand, when they
would have been applied to the <a>list of WebVTT Node Objects</a>, must instead be applied to the
<a>WebVTT cue background box</a>.</p>
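
<div class="example">
 <p>The two property lists above for ''::cue()'' with an argument can be summarized, informally, as
 a single predicate. In this sketch properties are identified by the shorthand or property name used
 in the lists rather than by individual longhands, which is a simplification; the function name
 <code>appliesToCueWithArgument</code> and the group labels <code>transition</code> and
 <code>animation</code> are hypothetical.</p>

```typescript
// Property groups that apply regardless of the selector used.
const ALWAYS_APPLICABLE_GROUPS: string[] = [
  "color",
  "opacity",
  "visibility",
  "text-decoration",
  "text-shadow",
  "background",
  "outline",
  "transition",
  "animation",
];

// Property groups that apply only when the selector contains
// neither :past nor :future.
const STATIC_ONLY_GROUPS: string[] = [
  "font",
  "line-height",
  "white-space",
  "text-combine-upright",
  "ruby-position",
];

// Returns whether a property group would be honored on ::cue(selector);
// groups that do not apply are ignored.
function appliesToCueWithArgument(
  group: string,
  selectorUsesPastOrFuture: boolean
): boolean {
  if (ALWAYS_APPLICABLE_GROUPS.indexOf(group) !== -1) {
    return true;
  }
  if (STATIC_ONLY_GROUPS.indexOf(group) !== -1) {
    return !selectorUsesPastOrFuture;
  }
  return false;
}
```

 <p>The second list is restricted because those properties can change box dimensions, and
 '':past'' and '':future'' can begin or stop matching while a cue is being rendered.</p>
</div>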


<h4 id=the-past-and-future-pseudo-classes>The '':past'' and '':future'' pseudo-classes</h4>

<p>The '':past'' and '':future'' pseudo-classes match <a lt="WebVTT Node Object">WebVTT Node
Objects</a> only under the conditions defined below. [[!SELECTORS4]]</p>

<p>The <dfn selector noexport>:past</dfn> pseudo-class only matches <a lt="WebVTT Node Object">WebVTT Node Objects</a>
that are <i>in the past</i>.</p>

<p algorithm="in the past">A <a>WebVTT Node Object</a> |c| is <dfn abstract-op>in the past</dfn> if, in a
pre-order, depth-first traversal of the <a>WebVTT cue</a>'s <a>list of WebVTT Node Objects</a>,
there exists a <a>WebVTT Timestamp Object</a> whose value is less than the <a>current playback
position</a> of the <a>media element</a> that is the <i>matched element</i>, entirely after the
<a>WebVTT Node Object</a> |c|.</p>

<p>The <dfn selector noexport>:future</dfn> pseudo-class only matches <a lt="WebVTT Node Object">WebVTT Node
Objects</a> that are <i>in the future</i>.</p>

<p algorithm="in the future">A <a>WebVTT Node Object</a> |c| is <dfn abstract-op>in the future</dfn> if, in a
pre-order, depth-first traversal of the <a>WebVTT cue</a>'s <a>list of WebVTT Node Objects</a>,
there exists a <a>WebVTT Timestamp Object</a> whose value is greater than the <a>current playback
position</a> of the <a>media element</a> that is the <i>matched element</i>, entirely before the
<a>WebVTT Node Object</a> |c|.</p>
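
<div class="example">
 <p>The two definitions above can be sketched, non-normatively, over a simplified tree model. The
 sketch reads "entirely after" as "later in the pre-order traversal than the node and all of its
 descendants", and "entirely before" as "earlier in the traversal than the node"; timestamps are
 leaves, so they are never ancestors. The type <code>CueTreeNode</code> and the functions
 <code>isInThePast</code> and <code>isInTheFuture</code> are hypothetical names.</p>

```typescript
// Hypothetical, simplified model of WebVTT Node Objects.
type CueTreeNode =
  | { kind: "internal"; label: string; children: CueTreeNode[] }
  | { kind: "timestamp"; seconds: number }
  | { kind: "text"; data: string };

interface TraversalEntry {
  node: CueTreeNode;
  first: number; // pre-order index of the node itself
  last: number;  // pre-order index of its last descendant, or of itself
}

// Flattens the root list in pre-order, depth-first order.
function preOrderEntries(rootList: CueTreeNode[]): TraversalEntry[] {
  const entries: TraversalEntry[] = [];
  function visit(node: CueTreeNode): void {
    const entry: TraversalEntry = { node: node, first: entries.length, last: entries.length };
    entries.push(entry);
    if (node.kind === "internal") {
      for (const child of node.children) {
        visit(child);
      }
    }
    entry.last = entries.length - 1;
  }
  for (const node of rootList) {
    visit(node);
  }
  return entries;
}

function entryFor(entries: TraversalEntry[], target: CueTreeNode): TraversalEntry | null {
  for (const entry of entries) {
    if (entry.node === target) {
      return entry;
    }
  }
  return null;
}

// True if some timestamp entirely after `target` is less than `now`.
function isInThePast(rootList: CueTreeNode[], target: CueTreeNode, now: number): boolean {
  const entries = preOrderEntries(rootList);
  const own = entryFor(entries, target);
  if (own === null) {
    return false;
  }
  for (const entry of entries) {
    const node = entry.node;
    if (node.kind === "timestamp" && entry.first > own.last && node.seconds < now) {
      return true;
    }
  }
  return false;
}

// True if some timestamp entirely before `target` is greater than `now`.
function isInTheFuture(rootList: CueTreeNode[], target: CueTreeNode, now: number): boolean {
  const entries = preOrderEntries(rootList);
  const own = entryFor(entries, target);
  if (own === null) {
    return false;
  }
  for (const entry of entries) {
    const node = entry.node;
    if (node.kind === "timestamp" && entry.first < own.first && node.seconds > now) {
      return true;
    }
  }
  return false;
}

// A karaoke-style cue: <16s> <c>This</c> <18s> <c>can</c> <20s> <c>match</c> <22s>
const wordThis: CueTreeNode = { kind: "internal", label: "c", children: [{ kind: "text", data: "This" }] };
const wordCan: CueTreeNode = { kind: "internal", label: "c", children: [{ kind: "text", data: "can" }] };
const wordMatch: CueTreeNode = { kind: "internal", label: "c", children: [{ kind: "text", data: "match" }] };
const karaokeCue: CueTreeNode[] = [
  { kind: "timestamp", seconds: 16 }, wordThis,
  { kind: "timestamp", seconds: 18 }, wordCan,
  { kind: "timestamp", seconds: 20 }, wordMatch,
  { kind: "timestamp", seconds: 22 },
];
```

 <p>With a current playback position of 19 seconds, <code>wordThis</code> is in the past,
 <code>wordMatch</code> is in the future, and <code>wordCan</code> is neither, so it is matched by
 neither pseudo-class.</p>
</div>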


<h4 id=the-cue-region-pseudo-element>The ''::cue-region'' pseudo-element</h4>

<p>Pseudo-elements apply to elements that are matched by selectors. For the purpose of this section,
that element is the <i>matched element</i>. The pseudo-element defined below affects the styling of
text track regions that are being rendered for the <i>matched element</i>.</p>

<p class="note">If the <i>matched element</i> is not a <a element>video</a> element, the
pseudo-element defined below won't have any effect according to this specification.</p>

<p>The <dfn selector>::cue-region</dfn> pseudo-element (with no argument) matches any list of <a lt="WebVTT
region object">WebVTT region objects</a> constructed for the <i>matched element</i>.</p>


<p>The <dfn selector>::cue-region(|selector|)</dfn> pseudo-element with an argument must have an argument that
consists of a CSS selector [[!SELECTORS4]]. It matches any list of <a lt="WebVTT region
object">WebVTT region objects</a> constructed for the <i>matched element</i> that also matches the
given CSS selector as follows:</p>

<ul>
 <li><p>Any region (list of <a>WebVTT region objects</a>) with a <a>WebVTT region identifier</a>
 matching the given ID.</p></li>
</ul>

<p>No other selector matching is defined for ''::cue-region(|selector|)''.</p>

<p>The same properties that apply to ''::cue'' apply to the ''::cue-region'' pseudo-element; other
properties set on the pseudo-element must be ignored.</p>

<p>When a user agent is rendering one or more text track regions according to the <a>rules for
updating the display of WebVTT text tracks</a>, <a lt="WebVTT region object">WebVTT region
objects</a> used in the rendering can be matched by the above pseudo-element. User agents that
support the pseudo-element must dynamically update renderings accordingly. When either 'white-space'
or one of the properties corresponding to the 'font' shorthand (including 'line-height') changes
value, then the <a>text track cue display state</a> of all the <a lt="WebVTT cue">WebVTT cues</a> in
the region must be emptied and the <a>text track</a>'s <a>rules for updating the text track
rendering</a> must be immediately rerun.</p>


<h2 id=api>API</h2>


<h3 id=the-vttcue-interface algorithm>The {{VTTCue}} interface</h3>

<p>The following interface is used to expose WebVTT cues in the DOM API:</p>

<pre class="idl">
enum AutoKeyword { "auto" };
typedef (double or AutoKeyword) LineAndPositionSetting;
enum DirectionSetting { "" /* horizontal */, "rl", "lr" };
enum LineAlignSetting { "start", "center", "end" };
enum PositionAlignSetting { "line-left", "center", "line-right", "auto" };
enum AlignSetting { "start", "center", "end", "left", "right" };
[Exposed=Window]
interface VTTCue : TextTrackCue {
  constructor(double startTime, unrestricted double endTime, DOMString text);
  attribute VTTRegion? region;
  attribute DirectionSetting vertical;
  attribute boolean snapToLines;
  attribute LineAndPositionSetting line;
  attribute LineAlignSetting lineAlign;
  attribute LineAndPositionSetting position;
  attribute PositionAlignSetting positionAlign;
  attribute double size;
  attribute AlignSetting align;
  attribute DOMString text;
  DocumentFragment getCueAsHTML();
};
</pre>

<dl class="note">

 <dt>|cue| = new <a constructor lt="VTTCue(startTime, endTime, text)">VTTCue</a>( |startTime|,
 |endTime|, |text| )</dt>
 <dd>
  <p>Returns a new {{VTTCue}} object, for use with the {{TextTrack/addCue()}} method.</p>
  <p>The |startTime| argument sets the <a>text track cue start time</a>.</p>
  <p>The |endTime| argument sets the <a>text track cue end time</a>. In the case of the value being
  positive Infinity, the {{VTTCue}} object represents an <a>unbounded text track cue</a>.</p>
  <p>The |text| argument sets the <a>cue text</a>.</p>
 </dd>

 <dt>|cue| . {{VTTCue/region}} [ = |value| ]</dt>
 <dd>
  <p>Returns the {{VTTRegion}} object to which this cue belongs, if any, or null otherwise.</p>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/vertical}} [ = |value| ]</dt>
 <dd>
  <p>Returns a string representing the <a>WebVTT cue writing direction</a>, as follows:</p>
  <dl class="switch">
   <dt>If it is <a lt="WebVTT cue horizontal writing direction">horizontal</a></dt>
   <dd><p>The empty string.</p></dd>
   <dt>If it is <a lt="WebVTT cue vertical growing left writing direction">vertical growing
   left</a></dt>
   <dd><p>The string "<code>rl</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue vertical growing right writing direction">vertical growing
   right</a></dt>
   <dd><p>The string "<code>lr</code>".</p></dd>
  </dl>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/snapToLines}} [ = |value| ]</dt>
 <dd>
  <p>Returns true if the <a>WebVTT cue snap-to-lines flag</a> is true, false otherwise.</p>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/line}} [ = |value| ]</dt>
 <dd>
  <p>Returns the <a>WebVTT cue line</a>. In the case of the value being <a lt="WebVTT cue line
  automatic">auto</a>, the string "<code>auto</code>" is returned.</p>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/lineAlign}} [ = |value| ]</dt>
 <dd>
  <p>Returns a string representing the <a>WebVTT cue line alignment</a>, as follows:</p>
  <dl class="switch">
   <dt>If it is <a lt="WebVTT cue line start alignment">start alignment</a></dt>
   <dd><p>The string "<code>start</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue line center alignment">center alignment</a></dt>
   <dd><p>The string "<code>center</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue line end alignment">end alignment</a></dt>
   <dd><p>The string "<code>end</code>".</p></dd>
  </dl>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/position}} [ = |value| ]</dt>
 <dd>
  <p>Returns the <a>WebVTT cue position</a>. In the case of the value being <a lt="WebVTT cue
  automatic position">auto</a>, the string "<code>auto</code>" is returned.</p>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/positionAlign}} [ = |value| ]</dt>
 <dd>
  <p>Returns a string representing the <a>WebVTT cue position alignment</a>, as follows:</p>
  <dl class="switch">
   <dt>If it is <a lt="WebVTT cue position line-left alignment">line-left alignment</a></dt>
   <dd><p>The string "<code>line-left</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue position center alignment">center alignment</a></dt>
   <dd><p>The string "<code>center</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue position line-right alignment">line-right alignment</a></dt>
   <dd><p>The string "<code>line-right</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue position automatic alignment">automatic alignment</a></dt>
   <dd><p>The string "<code>auto</code>".</p></dd>
  </dl>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/size}} [ = |value| ]</dt>
 <dd>
  <p>Returns the <a>WebVTT cue size</a>.</p>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/align}} [ = |value| ]</dt>
 <dd>
  <p>Returns a string representing the <a>WebVTT cue text alignment</a>, as follows:</p>
  <dl class="switch">
   <dt>If it is <a lt="WebVTT cue start alignment">start alignment</a></dt>
   <dd><p>The string "<code>start</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue center alignment">center alignment</a></dt>
   <dd><p>The string "<code>center</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue end alignment">end alignment</a></dt>
   <dd><p>The string "<code>end</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue left alignment">left alignment</a></dt>
   <dd><p>The string "<code>left</code>".</p></dd>
   <dt>If it is <a lt="WebVTT cue right alignment">right alignment</a></dt>
   <dd><p>The string "<code>right</code>".</p></dd>
  </dl>
  <p>Can be set.</p>
 </dd>

 <dt>|cue| . {{VTTCue/text}} [ = |value| ]</dt>
 <dd>
  <p>Returns the <a>cue text</a> in raw unparsed form.</p>
  <p>Can be set.</p>
 </dd>

 <dt>|fragment| = |cue| . <a method lt="getCueAsHTML()">getCueAsHTML</a>()</dt>
 <dd>
  <p>Returns the <a>cue text</a> as a {{DocumentFragment}} of <a>HTML elements</a> and other DOM
  nodes.</p>
 </dd>

</dl>

<p>The <dfn constructor for=VTTCue lt="VTTCue(startTime, endTime, text)">VTTCue(|startTime|,
|endTime|, |text|)</dfn> constructor, when invoked, must run the following steps:</p>

<ol algorithm="VTTCue construction">

 <li><p>Create a new <a>WebVTT cue</a>. Let |cue| be that <a>WebVTT cue</a>.</p></li>

 <li><p>Let |cue|'s <a>text track cue start time</a> be the value of the |startTime|
 argument.</p></li>

 <li><p>If the value of the |endTime| argument is negative Infinity or a Not-a-Number (NaN) value,
 then throw a <a
 href="https://tc39.es/ecma262/#sec-native-error-types-used-in-this-standard-typeerror">TypeError</a>
 exception. Otherwise, let |cue|'s <a>text track cue end time</a> be the value of the |endTime|
 argument.</p></li>

 <li><p>Let |cue|'s <a>cue text</a> be the value of the |text| argument, and let the <a>rules for
 extracting the chapter title</a> be the <a>WebVTT rules for extracting the chapter
 title</a>.</p></li>

 <!-- default settings -->

 <li><p>Let |cue|'s <a>text track cue identifier</a> be the empty string.</p></li>

 <li><p>Let |cue|'s <a>text track cue pause-on-exit flag</a> be false.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue region</a> be null.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue writing direction</a> be <a lt="WebVTT cue horizontal writing
 direction">horizontal</a>.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue snap-to-lines flag</a> be true.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue line</a> be <a lt="WebVTT cue line automatic">auto</a>.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue line alignment</a> be <a lt="WebVTT cue line start
 alignment">start alignment</a>.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue position</a> be <a lt="WebVTT cue automatic
 position">auto</a>.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue position alignment</a> be <a lt="WebVTT cue position automatic
 alignment">auto</a>.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue size</a> be 100.</p></li>

 <li><p>Let |cue|'s <a>WebVTT cue text alignment</a> be <a lt="WebVTT cue center alignment">center
 alignment</a>.</p></li>

 <li><p>Return the {{VTTCue}} object representing |cue|.</p></li>

</ol>
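
<div class="example">
<p>The construction steps above can be illustrated with a small stand-alone model. This is only a
sketch: the function name <code>createCueModel</code> and the plain-object shape are hypothetical
and chosen for illustration, and the property names simply mirror the {{VTTCue}} attributes. A user
agent exposes these values through the {{VTTCue}} interface rather than through such an object.</p>

```javascript
// Hypothetical model of the VTTCue construction steps; not a user agent API.
function createCueModel(startTime, endTime, text) {
  // Negative Infinity and NaN are rejected; positive Infinity is allowed.
  if (endTime === -Infinity || Number.isNaN(endTime)) {
    throw new TypeError("endTime must not be negative Infinity or NaN");
  }
  return {
    startTime,
    endTime,
    text,
    // Default settings, in the order the steps list them.
    id: "",
    pauseOnExit: false,
    region: null,
    vertical: "", // horizontal writing direction
    snapToLines: true,
    line: "auto",
    lineAlign: "start",
    position: "auto",
    positionAlign: "auto",
    size: 100,
    align: "center",
  };
}
```

<p>Under these assumptions, <code>createCueModel(0, 5, "Hello")</code> yields an object whose
settings all hold their defaults, while passing NaN as the end time throws.</p>
</div>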

<p>The <dfn attribute for=VTTCue>region</dfn> attribute, on getting, must return the {{VTTRegion}}
object representing the <a>WebVTT cue region</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object
represents, if any; or null otherwise. On setting, the <a>WebVTT cue region</a> must be set to the
new value.</p>

<p>The <dfn attribute for=VTTCue>vertical</dfn> attribute, on getting, must return the string from
the second cell of the row in the table below whose first cell is the <a>WebVTT cue writing
direction</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a>WebVTT cue writing direction</a></th>
   <th>{{VTTCue/vertical}} value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><a lt="WebVTT cue horizontal writing direction">Horizontal</a></td>
   <td>"<code></code>" (the empty string)</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue vertical growing left writing direction">Vertical growing left</a></td>
   <td>"<code>rl</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue vertical growing right writing direction">Vertical growing right</a></td>
   <td>"<code>lr</code>"</td>
  </tr>
 </tbody>
</table>

<p>On setting, the <a>WebVTT cue writing direction</a> must be set to the value given in the first
cell of the row in the table above whose second cell is a <a>case-sensitive</a> match for the new
value.</p>
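
<div class="example">
<p>The two-way table lookup described above can be sketched as follows. The constant and function
names are hypothetical, and the writing directions are represented here by plain strings purely for
illustration. The same pattern applies to the other table-driven attributes in this section.</p>

```javascript
// Rows of the table above: writing direction first, attribute value second.
const WRITING_DIRECTIONS = [
  ["horizontal", ""],
  ["vertical growing left", "rl"],
  ["vertical growing right", "lr"],
];

// Getting: find the row by its first cell and return the second cell.
function verticalFromDirection(direction) {
  const row = WRITING_DIRECTIONS.find(([first]) => first === direction);
  return row ? row[1] : undefined;
}

// Setting: find the row whose second cell is a case-sensitive match.
function directionFromVertical(value) {
  const row = WRITING_DIRECTIONS.find(([, second]) => second === value);
  return row ? row[0] : undefined;
}
```

<p>Because the match is case-sensitive, a value such as "<code>RL</code>" matches no row in this
sketch.</p>
</div>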

<p>The <dfn attribute for=VTTCue>snapToLines</dfn> attribute, on getting, must return true if the
<a>WebVTT cue snap-to-lines flag</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents
is true; or false otherwise. On setting, the <a>WebVTT cue snap-to-lines flag</a> must be set to the
new value.</p>

<p>The <dfn attribute for=VTTCue>line</dfn> attribute, on getting, must return the <a>WebVTT cue
line</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents. The special value <a
lt="WebVTT cue line automatic">auto</a> must be represented as the string "<code>auto</code>". On
setting, the <a>WebVTT cue line</a> must be set to the new value; if the new value is the string
"<code>auto</code>", then it must be interpreted as the special value <a lt="WebVTT cue line
automatic">auto</a>.</p>

<p class="note">In order to be able to set the {{VTTCue/snapToLines}} and {{VTTCue/line}} attributes
in any order, the API does not reject setting {{VTTCue/snapToLines}} to false when {{VTTCue/line}}
has a value outside the range 0..100, or vice versa.</p>
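
<div class="example">
<p>The order independence described in the note above can be sketched with a stand-alone model. The
object shape and the function names <code>setSnapToLines</code> and <code>setLine</code> are
hypothetical. The point is only that neither setter validates against the other, so both orders
succeed and end in the same state.</p>

```javascript
// Hypothetical setters; neither one inspects the other setting.
function setSnapToLines(model, value) {
  model.snapToLines = Boolean(value);
}

function setLine(model, value) {
  model.line = value; // a number, or the string "auto"
}

// Order 1: set the line first, then clear the flag.
const cueModel = { snapToLines: true, line: "auto" };
setLine(cueModel, -2);
setSnapToLines(cueModel, false);

// Order 2: clear the flag first, then set the line.
const other = { snapToLines: true, line: "auto" };
setSnapToLines(other, false);
setLine(other, -2);
```
</div>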

<p>The <dfn attribute for=VTTCue>lineAlign</dfn> attribute, on getting, must return the string from
the second cell of the row in the table below whose first cell is the <a>WebVTT cue line
alignment</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a>WebVTT cue line alignment</a></th>
   <th>{{VTTCue/lineAlign}} value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><a lt="WebVTT cue line start alignment">Start alignment</a></td>
   <td>"<code>start</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue line center alignment">Center alignment</a></td>
   <td>"<code>center</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue line end alignment">End alignment</a></td>
   <td>"<code>end</code>"</td>
  </tr>
 </tbody>
</table>

<p>On setting, the <a>WebVTT cue line alignment</a> must be set to the value given in the first cell
of the row in the table above whose second cell is a <a>case-sensitive</a> match for the new
value.</p>

<p>The <dfn attribute for=VTTCue>position</dfn> attribute, on getting, must return the <a>WebVTT cue
position</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents. The special value <a
lt="WebVTT cue automatic position">auto</a> must be represented as the string "<code>auto</code>".
On setting, if the new value is negative or greater than 100, then an {{IndexSizeError}} exception
must be thrown. Otherwise, the <a>WebVTT cue position</a> must be set to the new value; if the new
value is the string "<code>auto</code>", then it must be interpreted as the special value <a
lt="WebVTT cue automatic position">auto</a>.</p>
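
<div class="example">
<p>A minimal sketch of the setter behavior just described, assuming the new value has already been
converted to either a number or the string "<code>auto</code>". The names
<code>indexSizeError</code> and <code>setPosition</code> are hypothetical, and the error object is
only a stand-in for the {{IndexSizeError}} exception so that the sketch runs outside a user
agent.</p>

```javascript
// Stand-in for the exception named "IndexSizeError"; an assumption of this sketch.
function indexSizeError(message) {
  const error = new Error(message);
  error.name = "IndexSizeError";
  return error;
}

// Hypothetical model of the position setter; not a user agent API.
function setPosition(cueModel, value) {
  if (value === "auto") {
    cueModel.position = "auto"; // the special value auto
    return;
  }
  if (value < 0 || value > 100) {
    // The model is left unchanged when the value is out of range.
    throw indexSizeError("position must be in the range 0..100");
  }
  cueModel.position = value;
}
```

<p>The {{VTTCue/size}} attribute described below follows the same range check, without the
"<code>auto</code>" case.</p>
</div>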

<p>The <dfn attribute for=VTTCue>positionAlign</dfn> attribute, on getting, must return the string
from the second cell of the row in the table below whose first cell is the <a>WebVTT cue position
alignment</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a>WebVTT cue position alignment</a></th>
   <th>{{VTTCue/positionAlign}} value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><a lt="WebVTT cue position line-left alignment">Line-left alignment</a></td>
   <td>"<code>line-left</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue position center alignment">Center alignment</a></td>
   <td>"<code>center</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue position line-right alignment">Line-right alignment</a></td>
   <td>"<code>line-right</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue position automatic alignment">Automatic alignment</a></td>
   <td>"<code>auto</code>"</td>
  </tr>
 </tbody>
</table>

<p>On setting, the <a>WebVTT cue position alignment</a> must be set to the value given in the first
cell of the row in the table above whose second cell is a <a>case-sensitive</a> match for the new
value.</p>

<p>The <dfn attribute for=VTTCue>size</dfn> attribute, on getting, must return the <a>WebVTT cue
size</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents. On setting, if the new
value is negative or greater than 100, then an {{IndexSizeError}} exception must be thrown.
Otherwise, the <a>WebVTT cue size</a> must be set to the new value.</p>

<p>The <dfn attribute for=VTTCue>align</dfn> attribute, on getting, must return the string from the
second cell of the row in the table below whose first cell is the <a>WebVTT cue text alignment</a>
of the <a>WebVTT cue</a> that the {{VTTCue}} object represents:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a>WebVTT cue text alignment</a></th>
   <th>{{VTTCue/align}} value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><a lt="WebVTT cue start alignment">Start alignment</a></td>
   <td>"<code>start</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue center alignment">Center alignment</a></td>
   <td>"<code>center</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue end alignment">End alignment</a></td>
   <td>"<code>end</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue left alignment">Left alignment</a></td>
   <td>"<code>left</code>"</td>
  </tr>
  <tr>
   <td><a lt="WebVTT cue right alignment">Right alignment</a></td>
   <td>"<code>right</code>"</td>
  </tr>
 </tbody>
</table>

<p>On setting, the <a>WebVTT cue text alignment</a> must be set to the value given in the first cell
of the row in the table above whose second cell is a <a>case-sensitive</a> match for the new
value.</p>

<p>The <dfn attribute for=VTTCue>text</dfn> attribute, on getting, must return the raw <a>cue
text</a> of the <a>WebVTT cue</a> that the {{VTTCue}} object represents. On setting, the <a>cue
text</a> must be set to the new value.</p>

<p>The <dfn method for=VTTCue>getCueAsHTML()</dfn> method must convert the <a>cue text</a> to a
{{DocumentFragment}} for the <a>responsible document</a> specified by the <a>entry settings
object</a> by applying the <a>WebVTT cue text DOM construction rules</a> to the result of applying
the <a>WebVTT cue text parsing rules</a> to the <a>cue text</a>.</p>

<p class=note>A fallback language is not provided for {{VTTCue/getCueAsHTML()}} since a
{{DocumentFragment}} cannot expose language information.</p>


<h3 id=the-vttregion-interface algorithm>The {{VTTRegion}} interface</h3>

<p>The following interface is used to expose WebVTT regions in the DOM API:</p>

<pre class="idl">
enum ScrollSetting { "" /* none */, "up" };
[Exposed=Window]
interface VTTRegion {
  constructor();
  attribute DOMString id;
  attribute double width;
  attribute unsigned long lines;
  attribute double regionAnchorX;
  attribute double regionAnchorY;
  attribute double viewportAnchorX;
  attribute double viewportAnchorY;
  attribute ScrollSetting scroll;
};
</pre>

<dl class="note">

 <dt>|region| = new <a constructor lt="VTTRegion()">VTTRegion</a>()</dt>
 <dd>
  <p>Returns a new {{VTTRegion}} object.</p>
 </dd>

 <dt>|region| . {{VTTRegion/id}}</dt>
 <dd>
  <p>Returns the WebVTT region identifier. Can be set.</p>
 </dd>

 <dt>|region| . {{VTTRegion/width}}</dt>
 <dd>
  <p>Returns the WebVTT region width as a percentage of the video width. Can be set. Throws an
  {{IndexSizeError}} if the new value is not in the range 0..100.</p>
 </dd>

 <dt>|region| . {{VTTRegion/lines}}</dt>
 <dd>
  <p>Returns the WebVTT region lines, the region height as a number of lines. Can be set. Throws an
  {{IndexSizeError}} if the new value is negative.</p>
 </dd>

 <dt>|region| . {{VTTRegion/regionAnchorX}}</dt>
 <dd>
  <p>Returns the WebVTT region anchor X offset as a percentage of the region width. Can be set.
  Throws an {{IndexSizeError}} if the new value is not in the range 0..100.</p>
 </dd>

 <dt>|region| . {{VTTRegion/regionAnchorY}}</dt>
 <dd>
  <p>Returns the WebVTT region anchor Y offset as a percentage of the region height. Can be set.
  Throws an {{IndexSizeError}} if the new value is not in the range 0..100.</p>
 </dd>

 <dt>|region| . {{VTTRegion/viewportAnchorX}}</dt>
 <dd>
  <p>Returns the WebVTT region viewport anchor X offset as a percentage of the video width. Can be
  set. Throws an {{IndexSizeError}} if the new value is not in the range 0..100.</p>
 </dd>

 <dt>|region| . {{VTTRegion/viewportAnchorY}}</dt>
 <dd>
  <p>Returns the WebVTT region viewport anchor Y offset as a percentage of the video height. Can be
  set. Throws an {{IndexSizeError}} if the new value is not in the range 0..100.</p>
 </dd>

 <dt>|region| . {{VTTRegion/scroll}}</dt>
 <dd>
  <p>Returns a string representing the <a>WebVTT region scroll</a> as follows:</p>
  <dl class="switch">
   <dt>If it is <a lt="WebVTT region scroll none">none</a></dt>
   <dd><p>The empty string.</p></dd>
   <dt>If it is <a lt="WebVTT region scroll up">up</a></dt>
   <dd><p>The string "<code>up</code>".</p></dd>
  </dl>
  <p>Can be set.</p>
 </dd>
</dl>

<p>The <dfn constructor for=VTTRegion>VTTRegion()</dfn> constructor, when invoked, must run the
following steps:</p>

<ol algorithm="VTTRegion construction">
 <li><p>Create a new <a>WebVTT region</a>. Let |region| be that <a>WebVTT region</a>.</p></li>

 <!-- default settings -->
 <li><p>Let |region|'s <a>WebVTT region identifier</a> be the empty string.</p></li>

 <li><p>Let |region|'s <a>WebVTT region width</a> be 100.</p></li>

 <li><p>Let |region|'s <a>WebVTT region lines</a> be 3.</p></li>

 <li><p>Let |region|'s <a>WebVTT region anchor</a> X offset be 0.</p></li>

 <li><p>Let |region|'s <a>WebVTT region anchor</a> Y offset be 100.</p></li>

 <li><p>Let |region|'s <a>WebVTT region viewport anchor</a> X offset be 0.</p></li>

 <li><p>Let |region|'s <a>WebVTT region viewport anchor</a> Y offset be 100.</p></li>

 <li><p>Let |region|'s <a>WebVTT region scroll</a> be <a lt="WebVTT region scroll none">none</a>.</p></li>

 <li><p>Return the {{VTTRegion}} object representing |region|.</p></li>
</ol>


<p>The <dfn attribute for=VTTRegion>id</dfn> attribute, on getting, must return the <a>WebVTT region
identifier</a> of the <a>WebVTT region</a> that the {{VTTRegion}} object represents. On setting, the
<a>WebVTT region identifier</a> must be set to the new value.</p>

<p>The <dfn attribute for=VTTRegion>width</dfn> attribute, on getting, must return the <a>WebVTT
region width</a> of the <a>WebVTT region</a> that the {{VTTRegion}} object represents. On setting,
if the new value is negative or greater than 100, then an {{IndexSizeError}} exception must be
thrown. Otherwise, the <a>WebVTT region width</a> must be set to the new value.</p>

<p>The <dfn attribute for=VTTRegion>lines</dfn> attribute, on getting, must return the <a>WebVTT
region lines</a> of the <a>WebVTT region</a> that the {{VTTRegion}} object represents. On setting,
the <a>WebVTT region lines</a> must be set to the new value.</p>

<p>The <dfn attribute for=VTTRegion>regionAnchorX</dfn> attribute, on getting, must return the
<a>WebVTT region anchor</a> X offset of the <a>WebVTT region</a> that the {{VTTRegion}} object
represents. On setting, if the new value is negative or greater than 100, then an {{IndexSizeError}}
exception must be thrown. Otherwise, the <a>WebVTT region anchor</a> X offset must be set to the
new value.</p>

<p>The <dfn attribute for=VTTRegion>regionAnchorY</dfn> attribute, on getting, must return the
<a>WebVTT region anchor</a> Y offset of the <a>WebVTT region</a> that the {{VTTRegion}} object
represents. On setting, if the new value is negative or greater than 100, then an {{IndexSizeError}}
exception must be thrown. Otherwise, the <a>WebVTT region anchor</a> Y offset must be set to the
new value.</p>

<p>The <dfn attribute for=VTTRegion>viewportAnchorX</dfn> attribute, on getting, must return the
<a>WebVTT region viewport anchor</a> X offset of the <a>WebVTT region</a> that the {{VTTRegion}}
object represents. On setting, if the new value is negative or greater than 100, then an
{{IndexSizeError}} exception must be thrown. Otherwise, the <a>WebVTT region viewport anchor</a> X
offset must be set to the new value.</p>

<p>The <dfn attribute for=VTTRegion>viewportAnchorY</dfn> attribute, on getting, must return the
<a>WebVTT region viewport anchor</a> Y offset of the <a>WebVTT region</a> that the {{VTTRegion}}
object represents. On setting, if the new value is negative or greater than 100, then an
{{IndexSizeError}} exception must be thrown. Otherwise, the <a>WebVTT region viewport anchor</a> Y
offset must be set to the new value.</p>
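
<div class="example">
<p>The region defaults from the construction steps and the percentage checks described above can be
sketched together in a small stand-alone class. The names <code>percentage</code> and
<code>RegionModel</code> are hypothetical, and the thrown object is only a stand-in for the
{{IndexSizeError}} exception; this is not the {{VTTRegion}} interface itself.</p>

```javascript
// Returns the value if it lies in 0..100, otherwise throws a stand-in error.
function percentage(name, value) {
  if (value < 0 || value > 100) {
    const error = new Error(name + " must be in the range 0..100");
    error.name = "IndexSizeError"; // stand-in for the exception of that name
    throw error;
  }
  return value;
}

// Hypothetical model of a WebVTT region with its default settings.
class RegionModel {
  id = "";
  lines = 3;
  scroll = ""; // none
  #width = 100;
  #regionAnchorX = 0;
  #regionAnchorY = 100;
  #viewportAnchorX = 0;
  #viewportAnchorY = 100;

  get width() { return this.#width; }
  set width(value) { this.#width = percentage("width", value); }

  get regionAnchorX() { return this.#regionAnchorX; }
  set regionAnchorX(value) { this.#regionAnchorX = percentage("regionAnchorX", value); }

  get regionAnchorY() { return this.#regionAnchorY; }
  set regionAnchorY(value) { this.#regionAnchorY = percentage("regionAnchorY", value); }

  get viewportAnchorX() { return this.#viewportAnchorX; }
  set viewportAnchorX(value) { this.#viewportAnchorX = percentage("viewportAnchorX", value); }

  get viewportAnchorY() { return this.#viewportAnchorY; }
  set viewportAnchorY(value) { this.#viewportAnchorY = percentage("viewportAnchorY", value); }
}
```

<p>Since the check runs before the assignment, a rejected value leaves the previous setting in
place.</p>
</div>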

<p>The <dfn attribute for=VTTRegion>scroll</dfn> attribute, on getting, must return the string from
the second cell of the row in the table below whose first cell is the <a>WebVTT region scroll</a>
setting of the <a>WebVTT region</a> that the {{VTTRegion}} object represents:</p>

<table class="complex data">
 <thead>
  <tr>
   <th><a>WebVTT region scroll</a></th>
   <th>{{VTTRegion/scroll}} value</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td><a lt="WebVTT region scroll none">None</a></td>
   <td>"<code></code>" (the empty string)</td>
  </tr>
  <tr>
   <td><a lt="WebVTT region scroll up">Up</a></td>
   <td>"<code>up</code>"</td>
  </tr>
 </tbody>
</table>

<p>On setting, the <a>WebVTT region scroll</a> must be set to the value given in the first cell of
the row in the table above whose second cell is a <a>case-sensitive</a> match for the new value.</p>


<h2 id=iana>IANA considerations</h2>
<!-- http://www.w3.org/2002/06/registering-mediatype.html -->


<h3 id=iana-text-vtt><dfn><code>text/vtt</code></dfn></h3>

<p>This registration is for community review and will be submitted to the IESG for review, approval,
and registration with IANA.</p>

<!-- To: ietf-types@iana.org Subject: Registration of media type text/vtt -->

<dl>
 <dt>Type name:</dt>
 <dd>text</dd>
 <dt>Subtype name:</dt>
 <dd>vtt</dd>
 <dt>Required parameters:</dt>
 <dd>N/A</dd>
 <dt>Optional parameters:</dt>
 <dd>N/A</dd>
 <dt>Encoding considerations:</dt>
 <dd>8bit (always UTF-8)</dd>
 <dt>Security considerations:</dt>
 <dd>
  <p>Text track files themselves pose no immediate risk unless sensitive information is included
  within the data. Implementations, however, are required to follow specific rules when processing
  text tracks, to ensure that certain origin-based restrictions are honored. Failure to correctly
  implement these rules can result in information leakage, cross-site scripting attacks, and the
  like.</p>
 </dd>
 <dt>Interoperability considerations:</dt>
 <dd>
  <p>Rules for processing both conforming and non-conforming content are defined in this
  specification.</p>
 </dd>
 <dt>Published specification:</dt>
 <dd>
  This document is the relevant specification.
 </dd>
 <dt>Applications that use this media type:</dt>
 <dd>
  Web browsers and other video players.
 </dd>
 <dt>Additional information:</dt>
 <dd>
  <dl>
   <dt>Magic number(s):</dt>
   <dd>
    <p>WebVTT files all begin with one of the following byte sequences (where "EOF" means the end of
    the file):</p>
    <ul class="brief">
     <li> EF BB BF 57 45 42 56 54 54 0A </li>
     <li> EF BB BF 57 45 42 56 54 54 0D </li>
     <li> EF BB BF 57 45 42 56 54 54 20 </li>
     <li> EF BB BF 57 45 42 56 54 54 09 </li>
     <li> EF BB BF 57 45 42 56 54 54 EOF </li>
     <li> 57 45 42 56 54 54 0A </li>
     <li> 57 45 42 56 54 54 0D </li>
     <li> 57 45 42 56 54 54 20 </li>
     <li> 57 45 42 56 54 54 09 </li>
     <li> 57 45 42 56 54 54 EOF </li>
    </ul>
    <p class="note">(An optional UTF-8 BOM, the ASCII string "<code>WEBVTT</code>", and finally a
    space, tab, line break, or the end of the file.)</p>
   </dd>
   <dt>File extension(s):</dt>
   <dd>"<code>vtt</code>"</dd>
   <dt>Macintosh file type code(s):</dt>
   <dd>No specific Macintosh file type codes are recommended for this type.</dd>
  </dl>
 </dd>
 <dt>Person &amp; email address to contact for further information:</dt>
 <dd>Silvia Pfeiffer &lt;silviapfeiffer1@gmail.com></dd>
 <dt>Intended usage:</dt>
 <dd>Common</dd>
 <dt>Restrictions on usage:</dt>
 <dd>No restrictions apply.</dd>
 <dt>Authors:</dt>
 <dd>Silvia Pfeiffer &lt;silviapfeiffer1@gmail.com>, Simon Pieters &lt;simonp@opera.com>, Philip
 J&auml;genstedt &lt;philipj@opera.com>, Ian Hickson &lt;ian@hixie.ch></dd>
 <dt>Change controller:</dt>
 <dd>W3C</dd>
</dl>
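
<div class="example">
<p>The magic numbers listed in the registration above can be checked with a short routine. This is
a sketch only: the function name <code>hasWebVTTSignature</code> is hypothetical, and the input is
assumed to be an array (or typed array) holding the leading bytes of the resource, with the end of
the array standing for the end of the file.</p>

```javascript
// Hypothetical helper: tests bytes against the magic numbers listed above.
function hasWebVTTSignature(bytes) {
  const BOM = [0xef, 0xbb, 0xbf];
  const WEBVTT = [0x57, 0x45, 0x42, 0x56, 0x54, 0x54]; // "WEBVTT" in ASCII

  let offset = 0;
  // The UTF-8 BOM is optional.
  if (BOM.every((byte, i) => bytes[i] === byte)) {
    offset = BOM.length;
  }
  if (!WEBVTT.every((byte, i) => bytes[offset + i] === byte)) {
    return false;
  }
  offset += WEBVTT.length;
  // The signature ends at the end of the file ...
  if (offset === bytes.length) {
    return true;
  }
  // ... or continues with LF, CR, space, or tab.
  return [0x0a, 0x0d, 0x20, 0x09].includes(bytes[offset]);
}
```

<p>A check like this is only a first filter; the parsing rules defined in this specification decide
how the rest of the file is processed.</p>
</div>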

<p>Fragment identifiers have no meaning with <code>text/vtt</code> resources.</p>


<h2 class="no-num" id=privacy-and-security-considerations>Privacy and Security Considerations</h2>


<h3 id=fromat-security>Text-based format security</h3>

<p>As with any text-based format, it is possible to construct malicious content that might cause
buffer over-runs, value overflows (e.g. string representations of integers that overflow a given
word length), and the like. When implementing a parser, implementers should take care that over-long
lines, field values, or encoded values do not cause security problems.</p>

<h3 id=styling-security>Styling-related privacy and security</h3>

<p>WebVTT can embed CSS style sheets, which will be applied in user agents that support CSS. Under
these circumstances, the privacy and security considerations of CSS apply, with the following
caveats.</p>

<p>Such style sheets <a href="#style-no-external-resources">cannot fetch any external resources</a>,
and it is important for privacy that user agents do not allow this. Otherwise, WebVTT files could be
authored such that a third party is notified when the user watches a particular video, and even the
current time in that video.</p>

<p>It is possible for a user agent to offer user style sheets, but their presence and nature will
not be detectable by scripts running in the same user agent (e.g. browser) since the CSS object
model for such style sheets is not exposed to script and there is no way to get the computed style
for pseudo-elements other than ''::before'' and ''::after'' with the {{Window/getComputedStyle()}}
API. [[!CSSOM]]</p>

<h3 id=scripting-security>Scripting-related security</h3>

<p>WebVTT does not include or enable scripting. It is important that user agents do not support a
way to execute script embedded in a WebVTT file.</p>

<p>However, it is possible to construct and deliver a file that is designed not to present captions
or subtitles, but instead to provide timed input (‘triggers’) to a script system. A
poorly-written script or script system might then cause security, privacy or other problems,
although this consideration really applies to the script system. Since WebVTT supplies these
triggers at their timestamps, a malicious file might present such triggers very rapidly, perhaps
causing undue resource consumption.</p>

<h3 id=privacy-of-preference>Privacy of preference</h3>

<p>A user agent that selects a WebVTT file, and causes it to be downloaded or interpreted, might
indicate to the origin server that the user has a need for captions or subtitles, and also the <a
lt="honor user preferences for automatic text track selection">language preference</a> of the user
for captions or subtitles. That is a (small) piece of information about the user. However, the
offering of a caption file, and the choice whether to retrieve and consume it, are really
characteristics of the format or protocol that makes the offer (e.g. the HTML element), rather than
of the caption format itself. [[!HTML]]</p>


<h2 class="no-num" id=acknowledgements>Acknowledgements</h2>

<p>Thanks to the SubRip community, including in particular Zuggy and ai4spam, for their work on the
SubRip software program whose SRT file format was used as the basis for the WebVTT text track file
format.</p>

<p>Thanks to Ian Hickson and many others for their work on the HTML standard, where WebVTT was
originally specified. [[!HTML]]</p>

<p>
 Thanks to
 Addison Phillips,
 Alastor Wu,
 Andreas Tai,
 Anna Cavender,
 Anne van Kesteren,
 Benjamin Schaaf,
 Brian Quass,
 Caitlin Potter,
 Courtney Kennedy,
 Cyril Concolato,
 Dae Kim,
 David Singer,
 Eric Carlson,
 fantasai,
 Frank Olivier,
 Fredrik Söderquist,
 Giuseppe Pascale,
 Glenn Adams,
 Glenn Maynard,
 John Foliot,
 Kyle Huey,
 Lawrence Forooghian,
 Loretta Guarino Reid,
 Ms2ger,
 Nigel Megitt,
 Ralph Giles,
 Richard Ishida,
 Rick Eyre,
 Ronny Mennerich,
 Theresa O'Connor, and
 Victor Cărbune
 for their useful comments.
</p>